GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on:         January 7, 2005, 11:40:59 ; Search time 64.8455 Seconds 
                (without alignments) 
                1864.305 Million cell updates/sec 

Title:          US-10-726-721A-9 
Perfect score:  1715 
Sequence:       1 RGDVDDAGDCSGARYNDWSD...VKVGKFMAKLAEHMFPKSQE 337 

Scoring table:  BLOSUM62 
                Gapop 10.0 , Gapext 0.5 

Searched:       2002273 seqs, 358729299 residues 

Total number of hits satisfying chosen parameters:      2002273 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
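
The throughput figure in the run header can be cross-checked from the other header values. The sketch below assumes that one "cell update" is one query-residue by database-residue cell of the dynamic-programming matrix; that definition is an assumption, not something the report states, and the variable names are illustrative only.

```python
# Figures copied from the run header of this report.
query_length = 337           # residues in the query sequence
db_residues = 358_729_299    # residues searched
search_seconds = 64.8455     # search time, without alignments

# Assumption: cell updates = query residues x database residues.
cell_updates = query_length * db_residues
mcups = cell_updates / search_seconds / 1e6

print(f"{mcups:.3f} million cell updates/sec")
```

Under that assumption the result agrees with the reported 1864.305 Million cell updates/sec to the printed precision.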

Database : A_Geneseq_23Sep04 : * 

1: geneseqp1980s : * 

2: geneseqp1990s : * 

3: geneseqp2000s : * 

4: geneseqp2001s : * 

5: geneseqp2002s : * 

6: geneseqp2003as : * 

7: geneseqp2003bs : * 

8: geneseqp2004s : * 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

                   %
Result           Query
   No.    Score  Match  Length  DB  ID          Description

     1     1713   99.9     337   4  AAG67776    Aag67776 Amino aci
     2     1708   99.6     342   4  AAB27240    Aab27240 Human EXM
     3     1708   99.6     342   7  ADJ69028    Adj69028 Human hea
     4     1708   99.6     379   4  AAU12279    Aau12279 Human PRO
     5     1708   99.6     379   4  AAB68359    Aab68359 Amino aci
     6     1708   99.6     379   4  AAM40083    Aam40083 Human pol
     7     1708   99.6     379      ABO17723    Abo17723 Novel hum
     8     1708   99.6     379   6  ABU80977    Abu80977 Human PRO
     9     1708   99.6              ABU66677
    10     1708   99.6                          Novel
    11     1708   99.6              ABO24948    Abo24948
    12     1708   99.6           6  ABU66953    Abu66953 Human sec
    13     1708   99.6              ADA45735    Ada45735 Novel
    14     1708   99.6     379   6  ADA76166    Ada76166 Human PRO
    15     1708   99.6     379   6  ADA18816    Ada18816 Human PRO
    16     1708   99.6     379   6  ADA61439    Ada61439 Homo sapi
    17     1708   99.6     379   6  ADB19224    Adb19224 Novel hum
    18     1708   99.6     379   6  ADB27765    Adb27765 Human PRO
    19     1708   99.6     379   6  ADA86244    Ada86244 Novel hum
    20     1708   99.6     379   6  ADB15808    Adb15808 Human PRO
    21     1708   99.6     379   6  ADA47594    Ada47594 Human PRO
    22     1708   99.6     379   6  ADA67389    Ada67389 Human PRO
    23     1708   99.6     379   6  ADB30396    Adb30396 Human PRO
    24     1708   99.6     379   6  ADA85692    Ada85692 Novel hum
    25     1708   99.6     379   6  ADA96904    Ada96904 Human PRO
    26     1708   99.6     379   6  ADA79208    Ada79208 Human PRO
    27     1708   99.6     379   6  ADA87347    Ada87347 Novel hum
    28     1708   99.6     379   6  ADB16549    Adb16549 Human PRO
    29     1708   99.6     379   6  ADA91641    Ada91641 Novel hum
    30     1708   99.6     379   6  ADB14704    Adb14704 Human PRO
    31     1708   99.6     379   6  ADB18665    Adb18665 Novel hum
    32     1708   99.6     379   6  ADA93880    Ada93880 Human PRO
    33     1708   99.6     379   6  ADB19776    Adb19776 Novel hum
    34     1708   99.6     379   6  ADB13088    Adb13088 Human PRO
    35     1708   99.6     379   6  ABO43256    Abo43256 Novel hum
    36     1708   99.6     379   6  ADA74342    Ada74342 Human PRO
    37     1708   99.6     379   6  ADB24575    Adb24575 Human PRO
    38     1708   99.6     379   6  ADA82099    Ada82099 Human PRO
    39     1708   99.6     379   6  ADA75062    Ada75062 Human PRO
    40     1708   99.6     379   6  ADA85140    Ada85140 Novel hum
    41     1708   99.6     379   6  ADA84588    Ada84588 Novel hum
    42     1708   99.6     379   6  ADB29844    Adb29844 Human PRO
    43     1708   99.6     379   6  ADA80372    Ada80372 Human PRO
    44     1708   99.6     379   6  ADA75614    Ada75614 Human PRO
    45     1708   99.6     379   6  ADA46839    Ada46839 Human PRO
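
The "% Query Match" column appears consistent with the hit score expressed as a percentage of the perfect score (1715 for this query). The sketch below checks that reading against the two distinct scores in the summary list; the function name is hypothetical and the formula is an inference from the numbers in this report, not a documented definition.

```python
def query_match_percent(score, perfect_score):
    """Score as a percentage of the perfect score, rounded to one decimal.

    Assumption: this is how the '% Query Match' column is derived.
    """
    return round(100.0 * score / perfect_score, 1)

PERFECT_SCORE = 1715  # "Perfect score" from the run header

for score in (1713, 1708):
    print(score, query_match_percent(score, PERFECT_SCORE))
```

Both values round to the percentages listed for those scores (99.9 and 99.6).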



ALIGNMENTS 



RESULT 1 
AAG67776 

ID AAG67776 standard; protein; 337 AA. 
XX 

AC AAG67776; 
XX 

DT 10-DEC-2001 (first entry) 
XX 

DE Amino acid sequence of a human FE65 binding PTB1 domain protein. 
XX 

KW Human; phosphotyrosine binding domain 1; PTB1 domain; FE65; beta-amyloid; 

KW Alzheimer's disease; FEBP1; FE65 binding PTB1 domain protein; hnRNPL; 

KW neurodegenerative disease. 
XX 

OS Homo sapiens . 
XX 



FH Key Location/Qualifiers 

FT Misc-difference 305 

FT /note= "unspecified residue encoded by GNT" 
XX 

PN WO200159104-A1. 
XX 

PD 16-AUG-2001. 
XX 

PF 07-FEB-2001; 2001WO-FR000361 . 
XX 

PR 10-FEB-2000; 2000FR-00001628 . 

PR 18-APR-2000; 2000US-0198500P . 
XX 

PA (AVET ) AVENTIS PHARMA SA. 
XX 

PI Maury I, Mercken L, Fournier A; 
XX 

DR WPI; 2001-589717/66. 

DR N-PSDB; AAH78615. 
XX 

PT Compound capable of modulating interaction between the PTB1 domain of 

PT FE65 protein and hnRNPL and/or FEBP1 protein, useful to treat 

PT neurological disorders including Alzheimer's disease. 
XX 

PS Claim 9; Page 42-43; 51pp; French. 
XX 

CC The present sequence represents a human FEBP1 (FE65 binding PTB1 domain 

CC protein). The protein is a partner of the human FE65 protein. FE65 is 

CC implicated in the production of beta-amyloid. Partners of the FE65 

CC protein thus represent novel targets for the treatment of Alzheimer's 

CC disease. Such partners include FEBP1 and hnRNPL (undefined). Compounds 

CC which are capable of at least partially modulating interactions between 

CC hnRNPL and/or FEBP1 proteins or their homologues and the phosphotyrosine 

CC binding domain 1 (PTB1) domain of FE65 are used to treat 

CC neurodegenerative diseases. In particular, they are used for treating 

CC Alzheimer's disease 

XX 

SQ Sequence 337 AA; 

Query Match 99.9%; Score 1713; DB 4; Length 337; 
Best Local Similarity 100.0%; Pred. No. 2.2e-162; 

Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 RGDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRAR 60 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db   1 RGDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRAR 60 

Qy  61 RAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDL 120 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  61 RAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDL 120 

Qy 121 GGLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQL 180 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 121 GGLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQL 180 

Qy 181 AGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLR 240 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 181 AGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLR 240 

Qy 241 AQVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQV 300 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 241 AQVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQV 300 

Qy 301 CADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337 
       ||||||||||||||||||||||||||||||||||||| 
Db 301 CADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337 
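
The "Best Local Similarity" figures in these alignment summaries appear consistent with the number of matches divided by the number of aligned columns. The sketch below assumes the aligned length is Matches + Conservative + Mismatches + Indels; that formula is an inference from the counts printed in this report, and the function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percentage of aligned columns that are exact matches.

    Assumption: aligned columns = matches + conservative + mismatches + indels.
    """
    aligned = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned, 1)

# Counts as printed for RESULT 1 and for the hits with a single mismatch.
print(best_local_similarity(337, 0, 0, 0))
print(best_local_similarity(335, 0, 1, 0))
```

With these counts the function gives 100.0 and 99.7, the values printed for the corresponding results.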



RESULT 2 
AAB27240 

ID AAB27240 standard; protein; 342 AA. 
XX 

AC AAB27240; 
XX 

DT 27-MAR-2001 (first entry) 
XX 

DE Human EXMAD-18 SEQ ID NO: 18. 
XX 

KW Extracellular matrix and adhesion-associated protein; EXMAD; cancer; 

KW inflammation; reproductive disorder; cardiovascular disorder; 

KW immune disorder; musculoskeletal disorder; developmental disorder; 

KW gastrointestinal disorder; cell proliferation disorder. 

XX 

OS Homo sapiens. 
XX 

PN WO200068380-A2. 
XX 

PD 16-NOV-2000. 
XX 

PF 10-MAY-2000; 2000WO-US012811 . 
XX 

PR 11-MAY-1999; 99US-0133643P. 

PR 23-AUG-1999; 99US-0150409P . 
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Bandman O, Hillman JL, Tang YT, Lai P, Yue H, Baughn MR, Lu DAM; 

PI Azimzai Y; 

XX 

DR WPI; 2001-007395/01. 

DR N-PSDB; AAC66907. 
XX 

PT Isolated polynucleotide encoding extracellular matrix or adhesion- 

PT associated protein (EXMAD) useful for diagnosing, treating, or preventing 

PT disorders associated with expression of EXMAD such as proliferative, 

PT immune and genetic disorders. 

XX 

PS Claim 1; Page 105-106; 129pp; English. 
XX 

CC The present invention provides the protein and coding sequences for 25 

CC novel extracellular matrix and adhesion-associated proteins (EXMADs). 

CC These are designated EXMAD-1, EXMAD-2, EXMAD-3, EXMAD-4, EXMAD-5, EXMAD- 

CC 6, EXMAD-7, EXMAD-8, EXMAD-9, EXMAD-10, EXMAD-11, EXMAD-12, EXMAD-13, 

CC EXMAD-14, EXMAD-15, EXMAD-16, EXMAD-17, EXMAD-18, EXMAD-19, EXMAD-20, 

CC EXMAD-21, EXMAD-22, EXMAD-23, EXMAD-24 and EXMAD-25. They are useful in 

CC the prevention and treatment of cancers, cell proliferation, 

CC cardiovascular, reproductive, immune, musculoskeletal, developmental and 

CC gastrointestinal disorders and inflammation 

XX 

SQ Sequence 342 AA; 



Query Match 99.6%; Score 1708; DB 4; Length 342; 

Best Local Similarity 99.7%; Pred. No. 7.2e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db   7 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 66 

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  67 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 126 

Qy 122 GLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 127 GLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 186 

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 187 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 246 

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 247 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 306 

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337 
       ||| |||||||||||||||||||||||||||||||| 
Db 307 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 342 





RESULT 3 
ADJ69028 

ID ADJ69028 standard; protein; 342 AA. 
XX 

AC ADJ69028; 
XX 

DT 06-MAY-2004 (first entry) 
XX 

DE Human heart mitochondrial protein as a therapeutic target SeqID834. 
XX 

KW mitochondrial; human; screening assay; diabetes mellitus; 

KW Huntington's disease; osteoarthritis; 

KW Leber's hereditary optic neuropathy; LHON; 

KW mitochondrial encephalopathy lactic acidosis and stroke; MELAS; 

KW myoclonic epilepsy ragged red fibre syndrome; MERRF; cancer; 

KW neuroprotective; nootropic; antidiabetic; anticonvulsant; antiarthritic; 

KW osteopathic; ophthalmological; cytostatic. 

XX 

OS Homo sapiens. 
XX 

PN WO2003087768-A2. 



PD 23-OCT-2003. 
XX 

PF 04-APR-2003; 2003WO-US010870 . 
XX 

PR 12-APR-2002; 2002US-0372843P . 

PR 17-JUN-2002; 2002US-0389987P . 

PR 20-SEP-2002; 2002US-0412418P . 
XX 

PA (MITO-) MITOKOR. 

PA (BUCK-) BUCK INST AGE RES. 

XX 

PI Ghosh SS, Fahy ED, Zhang B, Gibson BW, Taylor SW, Glenn GM; 

PI Warnock DE; 

XX 

DR WPI; 2003-845369/78. 
XX 

PT Identifying a mitochondrial target for drug screening assays and for 

PT treating diseases associated with altered mitochondrial function, 

PT comprises detecting a modified polypeptide in a sample and correlating 

PT with the disease. 

XX 

PS Claim 1; SEQ ID NO 834; 180pp; English. 
XX 

CC This invention relates to novel mitochondrial targets that can be used 

CC for therapeutic intervention in treating a disease associated with 

CC altered mitochondrial function. Specifically, it refers to a method for 

CC identifying proteins of the human heart mitochondrial proteome that are 

CC useful for drug screening assays, as well as therapeutic targets. The 

CC present invention describes a method for identifying such proteins that 

CC can be used in the treatment of various diseases associated with altered 

CC mitochondrial function including diabetes mellitus, Huntington's disease, 

CC osteoarthritis, Leber's hereditary optic neuropathy (LHON), mitochondrial 

CC encephalopathy lactic acidosis and stroke (MELAS), myoclonic epilepsy 

CC ragged red fibre syndrome (MERRF) or cancer. Accordingly, these 

CC compositions have neuroprotective, nootropic, antidiabetic, 

CC anticonvulsant, antiarthritic, osteopathic, ophthalmological and 

CC cytostatic activities. This polypeptide sequence is a human heart 

CC mitochondrial protein of the invention. 

XX 

SQ Sequence 342 AA; 

Query Match 99.6%; Score 1708; DB 7; Length 342; 
Best Local Similarity 99.7%; Pred. No. 7.2e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db   7 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 66 

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  67 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 126 

Qy 122 GLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 127 GLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 186 

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 187 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 246 

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 247 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 306 

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337 
       ||| |||||||||||||||||||||||||||||||| 
Db 307 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 342 



RESULT 4 
AAU12279 

ID AAU12279 standard; protein; 379 AA. 
XX 

AC AAU12279; 
XX 

DT 24-OCT-2001 (first entry) 
XX 

DE Human PRO6007 polypeptide sequence. 
XX 

KW Human secretory and transmembrane; PRO; mammalian; cancer; lung; breast; 

KW prostate; cervical; tumour necrosis factor-alpha; TNF-alpha; cartilage; 

KW ear; proliferation; glucose; free fatty acid; skeletal muscle; adipocyte; 

KW A-peptide; factor VIIA; gene therapy. 
XX 

OS Homo sapiens. 
XX 



PN WO200140466-A2. 
XX 
PD 07-JUN-2001. 
XX 
PF 01-DEC-2000; 2000WO-US032678. 
XX 
PR 01-DEC-1999; 99WO-US028301. 
PR 01-DEC-1999; 99WO-US028634. 
PR 02-DEC-1999; 99WO-US028551. 
PR 02-DEC-1999; 99WO-US028564. 
PR 02-DEC-1999; 99WO-US028565. 
PR 09-DEC-1999; 99US-0170262P. 
PR 16-DEC-1999; 99WO-US030095. 
PR 20-DEC-1999; 99WO-US030911. 
PR 20-DEC-1999; 99WO-US030999. 
PR 30-DEC-1999; 99WO-US031243. 
PR 30-DEC-1999; 99WO-US031274. 
PR 05-JAN-2000; 2000WO-US000219. 
PR 06-JAN-2000; 2000WO-US000277. 
PR 06-JAN-2000; 2000WO-US000376. 
PR 11-FEB-2000; 2000WO-US003565. 
PR 18-FEB-2000; 2000WO-US004341. 
PR 18-FEB-2000; 2000WO-US004342. 
PR 22-FEB-2000; 2000WO-US004414. 
PR 24-FEB-2000; 2000WO-US004914. 
PR 24-FEB-2000; 2000WO-US005004. 
PR 01-MAR-2000; 2000WO-US005601. 
PR 02-MAR-2000; 2000WO-US005841. 
PR 03-MAR-2000; 2000US-0187202P. 
PR 10-MAR-2000; 2000WO-US006319. 
PR 15-MAR-2000; 2000WO-US006884. 
PR 20-MAR-2000; 2000WO-US007377. 
PR 21-MAR-2000; 2000WO-US007532. 
PR 30-MAR-2000; 2000WO-US008439. 
PR 17-MAY-2000; 2000WO-US013705. 
PR 22-MAY-2000; 2000WO-US014042. 
PR 30-MAY-2000; 2000WO-US014941. 
PR 02-JUN-2000; 2000WO-US015264. 
PR 05-JUN-2000; 2000US-0209832P. 
PR 28-JUL-2000; 2000WO-US020710. 
PR 11-AUG-2000; 2000WO-US022031. 
PR 23-AUG-2000; 2000WO-US023522. 
PR 24-AUG-2000; 2000WO-US023328. 
PR 08-NOV-2000; 2000WO-US030952. 
PR 10-NOV-2000; 2000WO-US030873. 
XX 

PA (GETH ) GENENTECH INC. 
XX 

PI Baker KP, Beresini M, Deforge L, Desnoyers L, Filvaroff E, Gao W; 

PI Gerritsen ME, Goddard A, Godowski PJ, Gurney AL, Sherwood S; 

PI Smith V, Stewart TA, Tumas D, Watanabe CK, Wood WI, Zhang Z; 
XX 

DR WPI; 2001-408281/43. 

DR N-PSDB; AAS21351. 
XX 

PT Isolated, secretory and transmembrane PRO polypeptide used to detect 

PT other PRO polypeptides, link bioactive molecules to cells expressing PRO 

PT polypeptides, and detect the presence of mammalian tumors e.g. lung, 

PT breast, prostate, cervical. 
XX 

PS Claim 12; Fig 216; 813pp; English. 
XX 

CC AAU12172-AAU12446 represent novel human secretory and transmembrane PRO 

CC polypeptides. The PRO polypeptides are useful to detect other PRO 

CC polypeptides, to link bioactive molecules to cells expressing PRO 

CC polypeptides, to modulate biological activities of cells expressing PRO 

CC polypeptides, and to detect the presence of mammalian lung, colon, 

CC breast, prostate, rectal, cervical or liver tumours by comparing PRO 

CC polypeptide expression in a cell sample to that in a control sample. Some 

CC of the 275 sequences are also useful to stimulate the release of tumour 

CC necrosis factor-alpha (TNF-alpha) from human blood, the proliferation or 

CC differentiation of chondrocytes, the proliferation or gene expression in 

CC pericyte cells, the release of proteoglycans from cartilage, the 

CC proliferation of inner ear utricular supporting cells or of T- 

CC lymphocytes, the release of a cytokine from peripheral blood monocytes 

CC (PBMCs), or the proliferation of endothelial cells. Some of the PRO 

CC polypeptides may modulate glucose or free fatty acid uptake by skeletal 

CC muscle cells or by adipocytes; or inhibit binding of A-peptide to factor 

CC VIIA. The PRO polypeptides can be used in assays to identify molecules 

CC involved in binding interactions. The polynucleotides encoding PRO 

CC polypeptides can be used to generate probes, antisense RNA/DNA, 

CC transgenic or knock out animals and can be used in gene therapy 



SQ Sequence 379 AA; 



Query Match 99.6%; Score 1708; DB 4; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103 

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163 

Qy 122 GLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 164 GLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223 

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283 

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343 

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337 
       ||| |||||||||||||||||||||||||||||||| 
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379 
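
Each alignment in this report is preceded by three summary lines of semicolon-separated "name value" fields. A minimal sketch of how those lines might be read into a dictionary follows; the function name and the sample text are illustrative (the sample copies the values printed for this result), and the parsing rule (last space-separated token is the value) is an assumption based on the layout seen here, not a documented format.

```python
STATS_TEXT = """\
Query Match 99.6%; Score 1708; DB 4; Length 379;
Best Local Similarity 99.7%; Pred. No. 8.3e-162;
Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0;
"""

def parse_stats(text):
    """Split 'name value;' fields and convert each value to int or float."""
    stats = {}
    for part in text.replace("\n", " ").split(";"):
        part = part.strip()
        if not part:
            continue  # skip the empty piece after a trailing ';'
        # The value is the last whitespace-separated token of the field.
        name, _, raw = part.rpartition(" ")
        raw = raw.rstrip("%")
        try:
            value = int(raw)
        except ValueError:
            value = float(raw)
        stats[name.strip()] = value
    return stats

stats = parse_stats(STATS_TEXT)
print(stats["Score"], stats["Query Match"], stats["Pred. No."])
```

Because fields are split on ";" and empty pieces are skipped, a final field without a trailing semicolon is still parsed.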



RESULT 5 
AAB68359 

ID AAB68359 standard; protein; 379 AA. 
XX 

AC AAB68359; 
XX 

DT 09-JUL-2001 (first entry) 
XX 

DE Amino acid sequence of a human IkappaB kinase binding protein Y2H35. 
XX 

KW IkappaB kinase binding protein; IKK binding protein; Y2H35; inflammation; 

KW apoptosis; inflammatory mediator. 

XX 

OS Homo sapiens. 
XX 

PN US6214582-B1. 
XX 

PD 10-APR-2001. 
XX 

PF 16-NOV-1998; 98US-00193266 . 
XX 

PR 16-NOV-1998; 98US-00193266 . 
XX 

PA (UYNY ) UNIV NEW YORK STATE RES FOUND. 
XX 

PI Marcu KB; 



XX 

DR WPI; 2001-315460/33. 

DR N-PSDB; AAF85219, AAF85220. 

XX 

PT Novel isolated nucleic acid molecule encoding isolated IkB kinase binding 

PT protein designated Y2H35, useful as probes and primers in molecular 

PT biology and biotechnology. 
XX 

PS Disclosure; Col 11-14; 10pp; English. 
XX 

CC The present sequence represents a human IkappaB kinase (IKK) binding 

CC protein, designated Y2H35. Fragments of Y2H35 polynucleotide are useful 

CC as probes and primers in molecular biology and biotechnology. The Y2H35 

CC protein is useful for elucidating and controlling pathways leading to 

CC inflammation and apoptosis, for detecting IKK complexes and modulating 

CC IKK activity in cells undergoing signalling by inflammatory mediator such 

CC as tumour necrosis factor alpha (TNFalpha) and interleukin-1 (IL-1), and 

CC for identifying therapeutically active agents that modulate the binding 

CC or interaction of Y2H35 and either IKKalpha or IKKbeta 
XX 

SQ Sequence 379 AA; 

Query Match 99.6%; Score 1708; DB 4; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103 

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163 

Qy 122 GLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 164 GLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223 

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283 

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343 

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337 
       ||| |||||||||||||||||||||||||||||||| 
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379 

RESULT 6 
AAM40083 

ID AAM40083 standard; protein; 379 AA. 
XX 

AC AAM40083; 
XX 



DT 22-OCT-2001 (first entry) 
XX 
DE Human polypeptide SEQ ID NO 3228. 
XX 
KW Human; nootropic; immunosuppressant; cytostatic; gene therapy; cancer; 
KW peripheral nervous system; neuropathy; central nervous system; CNS; 
KW Alzheimer's; Parkinson's disease; Huntington's disease; haemostatic; 
KW amyotrophic lateral sclerosis; Shy-Drager Syndrome; chemotactic; 
KW chemokinetic; thrombolytic; drug screening; arthritis; inflammation; 
KW leukaemia. 
XX 
OS Homo sapiens. 
XX 
PN WO200153312-A1. 
XX 
PD 26-JUL-2001. 
XX 
PF 26-DEC-2000; 2000WO-US034263. 
XX 
PR 23-DEC-1999; 99US-00471275. 
PR 21-JAN-2000; 2000US-00488725. 
PR 25-APR-2000; 2000US-00552317. 
PR 20-JUN-2000; 2000US-00598042. 
PR 19-JUL-2000; 2000US-00620312. 
PR 03-AUG-2000; 2000US-00653450. 
PR 14-SEP-2000; 2000US-00662191. 
PR 19-OCT-2000; 2000US-00693036. 
PR 29-NOV-2000; 2000US-00727344. 
XX 
PA (HYSE-) HYSEQ INC. 
XX 
PI Tang YT, Liu C, Asundi V, Chen R, Ma Y, Qian XB, Ren F, Wang D; 
PI Wang J, Wang Z, Wehrman T, Xu C, Xue AJ, Yang Y, Zhang J, Zhao QA; 
PI Zhou P, Goodrich R, Drmanac RT; 
XX 
DR WPI; 2001-442253/47. 
DR N-PSDB; AAI59239. 
XX 
PT Novel nucleic acids and polypeptides, useful for treating disorders such 
PT as central nervous system injuries. 
XX 
PS Example 5; SEQ ID NO 3228; 10078pp; English. 
XX 
CC The invention relates to human nucleic acids (AAI57798-AAI61369) and the 
CC encoded polypeptides (AAM38642-AAM42213) with nootropic, 
CC immunosuppressant and cytostatic activity. The polynucleotides are useful 
CC in gene therapy. A composition containing a polypeptide or polynucleotide 
CC of the invention may be used to treat diseases of the peripheral nervous 
CC system, such as peripheral nervous injuries, peripheral neuropathy and 
CC localised neuropathies and central nervous system diseases, such as 
CC Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic 
CC lateral sclerosis, and Shy-Drager Syndrome. Other uses include the 
CC utilisation of the activities such as: Immune system suppression, 
CC Activin/inhibin activity, chemotactic/chemokinetic activity, haemostatic 
CC and thrombolytic activity, cancer diagnosis and therapy, drug screening, 
CC assays for receptor activity, arthritis and inflammation, leukaemias and 
CC C.N.S disorders. Note: The sequence data for this patent did not form 
CC part of the printed specification 
XX 

SQ Sequence 379 AA; 

Query Match 99.6%; Score 1708; DB 4; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103 

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163 

Qy 122 GLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 164 GLPIVAKILNTRDPIVTCEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223 

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283 

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301 
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343 

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337 
       ||| |||||||||||||||||||||||||||||||| 
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379 





RESULT 7 
ABO17723 

ID ABO17723 standard; protein; 379 AA. 
XX 

AC ABO17723; 
XX 

DT 26-AUG-2003 (first entry) 
XX 

DE Novel human secreted and transmembrane protein PRO6007. 
XX 



KW Human; secreted and transmembrane protein; PRO; antiinflammatory; 

KW antiarteriosclerotic; cardiant; anti-infertility; anti-HIV; cytostatic; 

KW antidiabetic; gene therapy; tumour necrosis factor (TNF)-alpha release; 

KW TNF-alpha release; cell proliferation; cell differentiation; 

KW gene expression modulator; proteoglycan release; cytokine release; 

KW tumour; inflammatory disease; organ failure; atherosclerosis; 

KW cardiac injury; infertility; birth defect; premature aging; AIDS; 

KW acquired immunodeficiency syndrome; cancer; diabetic complication; 

KW chromosome mapping; gene mapping; pharmaceutical; diagnostic; biosensor; 

KW bioreactor; tissue typing. 

XX 

OS Homo sapiens . 
XX 

PN US2003032156-A1. 



XX 








PD 13-FEB-2003.
XX
PF 06-MAY-2002; 2002US-00140474.
XX
PR 31-MAR-1997; 97WO-US005230.
PR 12-JUN-1998; 98WO-US012456.
PR 14-JUL-1998; 98WO-US014552.
PR 28-AUG-1998; 98WO-US017888.
PR 10-SEP-1998; 98WO-US018824.
PR 14-SEP-1998; 98WO-US019093.
PR 14-SEP-1998; 98WO-US019094.
PR 14-SEP-1998; 98WO-US019177.
PR 16-SEP-1998; 98WO-US019330.
PR 17-SEP-1998; 98WO-US019437.
PR 07-OCT-1998; 98WO-US021141.
PR 29-OCT-1998; 98WO-US022991.
PR 29-OCT-1998; 98WO-US022992.
PR 20-NOV-1998; 98WO-US024855.
PR 01-DEC-1998; 98WO-US025108.
PR 05-JAN-1999; 99WO-US000106.
PR 08-MAR-1999; 99WO-US005028.
PR 10-MAR-1999; 99WO-US005190.
PR 20-APR-1999; 99WO-US008615.
PR 14-MAY-1999; 99WO-US010733.
PR 02-JUN-1999; 99WO-US012252.
PR 01-SEP-1999; 99WO-US020111.
PR 08-SEP-1999; 99WO-US020594.
PR 13-SEP-1999; 99WO-US020944.
PR 15-SEP-1999; 99WO-US021090.
PR 15-SEP-1999; 99WO-US021547.
PR 05-OCT-1999; 99WO-US023089.
PR 29-NOV-1999; 99WO-US028214.
PR 30-NOV-1999; 99WO-US028313.
PR 30-NOV-1999; 99WO-US028409.
PR 01-DEC-1999; 99WO-US028301.
PR 01-DEC-1999; 99WO-US028634.
PR 02-DEC-1999; 99WO-US028551.
PR 02-DEC-1999; 99WO-US028564.
PR 02-DEC-1999; 99WO-US028565.
PR 16-DEC-1999; 99WO-US030095.
PR 20-DEC-1999; 99WO-US030911.
PR 20-DEC-1999; 99WO-US030999.
PR 22-DEC-1999; 99WO-US030720.
PR 30-DEC-1999; 99WO-US031243.
PR 30-DEC-1999; 99WO-US031274.
PR 05-JAN-2000; 2000WO-US000219.
PR 06-JAN-2000; 2000WO-US000277.
PR 06-JAN-2000; 2000WO-US000376.
PR 11-FEB-2000; 2000WO-US003565.
PR 18-FEB-2000; 2000WO-US004341.
PR 18-FEB-2000; 2000WO-US004342.
PR 22-FEB-2000; 2000WO-US004414.
PR 24-FEB-2000; 2000WO-US004914.
PR 24-FEB-2000; 2000WO-US005004.
PR 01-MAR-2000; 2000WO-US005601.
PR 02-MAR-2000; 2000WO-US005746.



PR 02-MAR-2000; 2000WO-US005841.
PR 10-MAR-2000; 2000WO-US006319.
PR 15-MAR-2000; 2000WO-US006884.
PR 20-MAR-2000; 2000WO-US007377.
PR 21-MAR-2000; 2000WO-US007532.
PR 30-MAR-2000; 2000WO-US008439.
PR 17-MAY-2000; 2000WO-US013705.
PR 22-MAY-2000; 2000WO-US014042.
PR 30-MAY-2000; 2000WO-US014941.
PR 02-JUN-2000; 2000WO-US015264.
PR 28-JUL-2000; 2000WO-US020710.
PR 11-AUG-2000; 2000WO-US022031.
PR 23-AUG-2000; 2000WO-US023522.
PR 24-AUG-2000; 2000WO-US023328.
PR 08-NOV-2000; 2000WO-US030952.
PR 10-NOV-2000; 2000WO-US030873.
PR 01-DEC-2000; 2000WO-US032678.
PR 20-DEC-2000; 2000US-00747259.
PR 20-DEC-2000; 2000WO-US034956.
PR 28-FEB-2001; 2001US-00796498.
PR 28-FEB-2001; 2001WO-US006520.
PR 01-MAR-2001; 2001WO-US006666.
PR 09-MAR-2001; 2001US-00802706.
PR 14-MAR-2001; 2001US-00808689.
PR 22-MAR-2001; 2001US-00816744.
PR 05-APR-2001; 2001US-00828366.
PR 10-MAY-2001; 2001US-00854208.
PR 10-MAY-2001; 2001US-00854280.
PR 18-MAY-2001; 2001US-00860216.
PR 25-MAY-2001; 2001US-00866028.
PR 25-MAY-2001; 2001US-00866034.
PR 25-MAY-2001; 2001WO-US017092.
PR 01-JUN-2001; 2001US-00872035.
PR 01-JUN-2001; 2001WO-US017800.
PR 05-JUN-2001; 2001US-00874503.
PR 14-JUN-2001; 2001US-00882636.
PR 19-JUN-2001; 2001US-00886342.
PR 20-JUN-2001; 2001WO-US019692.
PR 21-JUN-2001; 2001US-00887879.
PR 22-JUN-2001; 2001WO-US020116.
PR 29-JUN-2001; 2001WO-US021066.
PR 09-JUL-2001; 2001WO-US021735.
PR 18-JUL-2001; 2001US-00908827.
PR 06-AUG-2001; 2001US-00924419.
PR 09-AUG-2001; 2001US-00927796.
PR 16-AUG-2001; 2001US-00931836.
PR 19-DEC-2001; 2001US-00028072.



XX 

PA (GETH ) GENENTECH INC. 
XX 

PI Baker KP, Beresini M, Deforge L, Desnoyers L, Filvaroff E, Gao W; 

PI Gerritsen ME, Goddard A, Godowski PJ, Gurney AL, Sherwood S; 

PI Smith V, Stewart TA, Tumas D, Watanabe CK, Wood WI, Zhang Z; 
XX 

DR WPI; 2003-341980/32. 

DR N-PSDB; ACD23960. 
XX 



PT New secreted and transmembrane PRO nucleic acids, for treating 

PT inflammation, organ failure, atherosclerosis, cardiac injury, 

PT infertility, birth defects, premature aging, acquired immunodeficiency

PT syndrome (AIDS), or cancer. 

XX 

PS Claim 12; Fig 216; 660pp; English. 
XX 

CC The invention describes an isolated nucleic acid (I) comprising, or which 

CC has 80% sequence identity to, or the full-length coding sequence of, one

CC of 275 nucleotide sequences, and which encodes a corresponding 

CC polypeptide selected from 275 amino acid sequences, where all sequences 

CC are given in the specification. The polypeptide encoded by (I) is used to 

CC detect PRO polypeptides, link a bioactive molecule to a cell expressing a 

CC PRO polypeptide, modulate a biological activity of a cell, stimulate the 

CC release of tumour necrosis factor (TNF)-alpha from human blood, modulate

CC the uptake of glucose or free fatty acid by cells, stimulate or inhibit 

CC the proliferation or differentiation of cells or gene expression, 
PT New isolated PRO polypeptide useful for treating diabetes, rheumatoid

PT arthritis, sports injuries, obesity, hearing loss in mammals, stroke, or 

PT heart attack. 
CC The present invention relates to the isolation of novel human PRO

CC polypeptides, and the polynucleotide sequences encoding them. The PRO 

CC polypeptides are secreted and transmembrane proteins. The PRO 

CC polypeptides and polynucleotides are useful for preparing a medicament 

CC useful in the treatment of diabetes, bone and/or cartilage disorders 

CC (e.g. rheumatoid arthritis, sports injuries, osteoarthritis), obesity, 

CC hyper- or hypo-insulinaemia, hearing loss, and coagulation disorders 

CC (e.g. stroke, heart attack). Anti-PRO antibodies are useful in diagnostic

CC assays for PRO, by detecting its expression in specific cells, tissues or 

CC serum, and for affinity purification of PRO from recombinant cell culture 

CC or natural sources. ABU80870-ABU81144 represent the human PRO 

CC polypeptides of the invention. Note: The sequence data for this patent 

CC was obtained in electronic format directly from the USPTO web site at 

CC seqdata.uspto.gov/psipsDIDEntry.html 
PI Baker KP, Beresini M, Deforge L, Desnoyers L, Filvaroff E, Gao W;

PI Gerritsen ME, Goddard A, Godowski PJ, Gurney AL, Sherwood S; 

PI Smith V, Stewart TA, Tumas D, Watanabe CK, Wood WI, Zhang Z; 
XX 

DR WPI; 2003-332040/31. 

DR N-PSDB; ACA03710. 
XX 

PT New secreted and transmembrane PRO nucleic acids, useful for gene 

PT therapy, in chromosome and gene mapping, as chromosome markers, in tissue 

PT typing, and in chromosome identification. 

XX 

PS Claim 12; Fig 216; 660pp; English. 
XX 

CC The present invention relates to the isolation of novel human PRO 

CC polypeptides, and the polynucleotide sequences encoding them. The PRO 

CC polypeptides are secreted and transmembrane proteins. The PRO 

CC polypeptides are useful for detecting other PRO polypeptides, for linking 

CC bioactive molecules to cells expressing PRO polypeptides, for modulating 

CC biological activities of cells expressing PRO polypeptides, and for

CC identifying agonists or antagonists. The PRO polypeptides are useful for

CC stimulating the release of tumour necrosis factor (TNF)-alpha from

CC human blood, for stimulating the proliferation or differentiation of 

CC chondrocytes, and detecting the presence of tumours. The polynucleotide 

CC sequences encoding PRO polypeptides are useful as hybridisation probes, 

CC in chromosome and gene mapping, in the generation of antisense RNA and 

CC DNA, in the preparation of PRO polypeptides, for generating transgenic 



CC animals or knockout animals, for the genetic analysis of individuals with

CC genetic disorders, and in gene therapy. ABU66570-ABU66844 represent the 

CC human PRO polypeptides of the invention. Note: The sequence data for this 

CC patent was obtained in electronic format directly from the USPTO web site 

CC at seqdata.uspto.gov/psipsDIDEntry.html 
XX 

SQ Sequence 379 AA; 



Query Match 99.6%; Score 1708; DB 6; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 
RESULT 10
ABU59758 

ID ABU59758 standard; protein; 379 AA. 
XX 

AC ABU59758; 
XX 

DT 13-MAY-2003 (first entry) 
XX 

DE Novel secreted and transmembrane protein PRO6007. 
XX 

KW Human; PRO; hypertrophy of neonatal heart; angiogenesis; wound healing; 

KW cardiac insufficiency disorder; cancer; tumour; immune response; 

KW adrenal cortical capillary endothelial growth; c-fos induction; 

KW vascular endothelial growth factor inhibition; VEGF inhibition; 

KW endothelial cell growth inhibitor; T-lymphocytes stimulation; 

KW retinal neurons cell survival; rod photoreceptor cell survival; 

KW retinal disorder; retinitis pigmentosum; kidney disorder; 

KW mammalian kidney mesangial cell proliferation; Berger disease; 

KW dermatitis; herpetiformis; Crohn's disease; chondrocyte proliferation; 

KW chondrocyte redifferentiation; sports injury; arthritis.
XX

PA (GETH ) GENENTECH INC. 
XX 

PI Baker KP, Beresini M, Deforge L, Desnoyers L, Filvaroff E, Gao W; 

PI Gerritsen ME, Goddard A, Godowski PJ, Gurney AL, Sherwood S; 

PI Smith V, Stewart TA, Tumas D, Watanabe CK, Wood WI, Zhang Z;



DR WPI; 2003-148238/14. 

DR N-PSDB; ABX89248. 
XX 

PT Two hundred and seventy five nucleic acids encoding PRO polypeptides,

PT useful for treating pericyte-associated tumors, diabetes and various bone 

PT and/or cartilage disorders, e.g. arthritis. 

XX 

PS Claim 12; Fig 216; 659pp; English. 
XX 

CC The invention describes an isolated human PRO polypeptide. The PRO 

CC polypeptides are useful in detecting PRO polypeptides in a sample, in 

CC linking a bioactive molecule to a cell expressing a PRO polypeptide, and 

CC in modulating at least one biological activity of a cell expressing a PRO 

CC polypeptide. PRO1312 stimulates hypertrophy of neonatal heart and is thus

CC useful for treating cardiac insufficiency disorders. PRO1154 and PRO1186

CC stimulate adrenal cortical capillary endothelial growth, and PRO536,

CC PRO943, PRO828, PRO826, PRO1068 or PRO535, PRO826, PRO819, PRO1126,

CC PRO1360 and PRO1387 induce c-fos in endothelial cells, and are thus

CC useful for treating conditions or disorders where angiogenesis would be

CC beneficial, e.g. wound healing and antagonist of this polypeptide are

CC useful for treating cancerous tumours. PRO812 inhibits vascular

CC endothelial growth factor (VEGF) stimulated proliferation of endothelial

CC cells and is thus useful for inhibiting endothelial cell growth in

CC mammals which would be beneficial in inhibiting tumour growth. PRO826,

CC PRO1068, PRO1184, PRO1346 and PRO1375 stimulate proliferation of

CC stimulated T-lymphocytes and are therapeutically useful for enhancing

CC immune response. PRO828, PRO826, PRO1068 or PRO1132 enhance survival of

CC retinal neurons cells (PRO1132 also enhances survival/proliferation of

CC rod photoreceptor cells) and therefore are useful for treating retinal

CC disorders or injuries, e.g. retinitis pigmentosum, AMD. PRO819, PRO813

CC and PRO11066 induce proliferation of mammalian kidney mesangial cells,

CC and therefore are useful for treating kidney disorders associated with

CC decreased mesangial cell function such as Berger disease or other

CC nephropathies associated with dermatitis, herpetiformis or Crohn's

CC disease. PRO1310, PRO844, PRO1312, PRO1192 and PRO1387 induce the

CC proliferation and/or redifferentiation of chondrocytes in culture and are

CC thus useful for treating sports injuries, and arthritis. This is the 

CC amino acid sequence of a novel human PRO protein 

XX 

SQ Sequence 379 AA; 

Query Match 99.6%; Score 1708; DB 6; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   2 GDVT)DAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVTDDAGDCSGARYNDWSDDDDDSNESKSIWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVXCLVTMSEKPYILEAALI 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRAJSPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVlCEKALIVXNNLSWAENQRRLKvyMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIWEKALIVl.NNLSWAENQRRLKVyMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTWNEYQHMLANSISDFFRLFSAGNEETKLQVLKLL1N 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNNTIVrNEYQHMLANSISD 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVTFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379
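
The summary figures reported for this hit can be cross-checked by hand. The Python sketch below does so under two assumptions that this part of the report does not state: that Query Match is the hit score divided by the perfect score (1715, from the run header), and that Best Local Similarity is the number of matches divided by the number of aligned positions. The helper name `percent` is illustrative only.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)

# Values copied from the report; the formulas below are assumptions.
PERFECT_SCORE = 1715          # "Perfect score" in the run header
score = 1708                  # "Score" for this hit
matches, conservative, mismatches, indels = 335, 0, 1, 0

# Aligned positions: every column of the local alignment (Qy 2..337).
aligned = matches + conservative + mismatches + indels

query_match = percent(score, PERFECT_SCORE)
best_local_similarity = percent(matches, aligned)

print(aligned, query_match, best_local_similarity)
```

Under these assumptions the recomputed values agree with the reported 99.6% Query Match and 99.7% Best Local Similarity, and the 336 aligned positions match the span Qy 2 to 337.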



RESULT 11 
AB024948 

ID AB024948 standard; protein; 379 AA. 
XX 

AC AB024948; 
XX 

DT 05-SEP-2003 (first entry) 
XX 

DE Human secreted/transmembrane protein (PRO) #108. 
XX 

KW Human; PRO; secreted protein; transmembrane protein; tumour; cytostatic; 

KW gene therapy; tumour necrosis factor-alpha; TNF-alpha; blood; 

KW proteoglycan; cartilage; cytokine; peripheral blood mononuclear cell; 

KW PBMC; glucose uptake; FFA; skeletal muscle cell; adipocyte cell; 

KW chondrocyte cell proliferation; chondrocyte cell differentiation; 

KW pericyte cell; inner ear utricular supporting cell; T-lymphocyte cell; 

KW endothelial cell; A-peptide; factor VIIA.

XX 

OS Homo sapiens.
PN US2003036179-A1.
XX

PA (GETH ) GENENTECH INC. 
XX 

PI Baker KP, Beresini M, Deforge L, Desnoyers L, Filvaroff E, Gao W; 

PI Gerritsen ME, Goddard A, Godowski PJ, Gurney AL, Sherwood S; 

PI Smith V, Stewart TA, Tumas D, Watanabe CK, Wood WI, Zhang Z;
XX 

DR WPI; 2003-466355/44. 

DR N-PSDB; ACD41902. 
XX 

PT New isolated nucleic acid encoding a PRO polypeptide, e.g. PRO1114 or

PT PRO4978, useful in molecular biology, chromosome and gene mapping, in

PT generating antisense RNA and DNA, and in gene therapy. 
XX 

PS Claim 12; Fig 216; 659pp; English. 
XX 

CC The invention relates to an isolated nucleic acid comprising at least 80% 

CC sequence identity to a PRO (secreted and transmembrane protein) cDNA 

CC comprising a nucleic acid (a) encoding a PRO polypeptide, or its 

CC extracellular domain (with or without its associated signal peptide),

CC which comprises any of the 275 120-850 residue amino acid sequences, 

CC given in the specification; (b) comprising any of the 275 300-3500 

CC nucleotide sequences, given in the specification; or (c) comprising the

CC full-length coding sequence of the nucleotide sequences given in the 

CC specification, or of the DNA deposited under any of the American Type 

CC Culture Collection (ATCC) Accession Numbers listed in the specification. 

CC Also included are a vector comprising the novel nucleic acid, a host cell 

CC comprising the vector, producing a PRO polypeptide, the isolated PRO 

CC polypeptides detailed above, a chimaeric molecule comprising the PRO 

CC polypeptide fused to a heterologous amino acid sequence, an anti-PRO

CC antibody, detecting a PRO polypeptide in a sample suspected of containing 

CC the PRO polypeptide, linking a bioactive molecule to a cell expressing a 



CC PRO polypeptide, modulating at least one biological activity of a cell 

CC expressing a PRO polypeptide, stimulating the release of tumour necrosis 

CC factor-alpha (TNF-alpha) from human blood, (or proteoglycans from 

CC cartilage or cytokine from peripheral blood mononuclear cells (PBMC)), 

CC modulating the uptake of glucose or FFA by skeletal muscle cells or 

CC adipocyte cells, stimulating the proliferation or differentiation of 

CC chondrocyte cells (or proliferation of or gene expression in pericyte 

CC cells), stimulating the proliferation of inner ear utricular supporting 

CC cells (or of T-lymphocyte cells, or of endothelial cells), inhibiting the 

CC binding of A-peptide to factor VIIA, or differentiation of adipocyte

CC cells, detecting the presence of a tumour in a mammal and an 

CC oligonucleotide probe derived from any of the nucleotide sequences given 

CC in the specification. The polynucleotide is useful in molecular biology, 

CC including uses as hybridisation probes, in chromosome and gene mapping, 

CC in generating antisense RNA and DNA, and in gene therapy. The 

CC polynucleotide may also be used in preparing PRO polypeptides by 

CC recombinant techniques, and in generating either transgenic animals or 

CC knock-out animals which, in turn, are useful in the development and 

CC screening of therapeutically useful reagents. The PRO polypeptide or the 

CC antibody is used in preparing a medicament for treating a condition 

CC responsive to the polypeptide or antibody, such as tumours, and in 

CC various diagnostic assays. The present sequence represents a PRO

CC polypeptide 

XX 

SQ Sequence 379 AA; 

Query Match 99.6%; Score 1708; DB 6; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVTDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVT)DAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
Db 164 GLPIVAKILNTRDPIVICEKALIVXNNLSWAENQRRLKVTMNQVCDDTITSRLNSSVQ^ 223

Qy 182 GLRLLTNMTWNEYQHMLANSISDFFRLFSAGNEETKLQVXKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Db 284 QVPSSLGSLFNKKENKEVILKLLVTFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379

RESULT 12
ABU66953 



ID ABU66953 standard; protein; 379 AA. 
XX 

AC ABU66953; 
XX 

DT 27-MAY-2003 (first entry) 
XX 

DE Human secreted/transmembrane, PRO, protein SEQ ID 216.
XX 

KW Human; secreted protein; transmembrane protein; PRO; 

KW inflammatory disease; organ failure; atherosclerosis; cardiac injury; 

KW infertility; birth defects; premature aging; AIDS; biosensor; 

KW acquired immunodeficiency syndrome; cancer; diabetic complication; 

KW bioreactor; tumour. 

XX 

OS Homo sapiens. 
XX 

PN US2003032155-A1. 
XX 
PR 29-JUN-2001; 2001WO-US021066.

PR 09-JUL-2001; 2001WO-US021735.

PR 18-JUL-2001; 2001US-00908827.

PR 16-AUG-2001; 2001US-00931836.

PR 19-DEC-2001; 2001US-00028072.
XX 

PA (GETH ) GENENTECH INC. 
XX 

PI Baker KP, Beresini M, Deforge L, Desnoyers L, Filvaroff E, Gao W; 

PI Gerritsen ME, Goddard A, Godowski PJ, Gurney AL, Sherwood S; 

PI Smith V, Stewart TA, Tumas D, Watanabe CK, Wood WI, Zhang Z; 
XX 

DR WPI; 2003-331925/31. 

DR N-PSDB; ACA04131. 
XX 

PT New secreted and transmembrane nucleic acids and polypeptides, designated 

PT as PRO, useful for treating inflammation, organ failure, atherosclerosis, 

PT cardiac injury, infertility, birth defects, premature aging, AIDS, or 

PT cancer. 
XX 

PS Claim 12; Fig 216; 659pp; English. 
XX 

CC The invention relates to an isolated nucleic acid comprising, or which is 

CC at least 80% identical to, or the full-length coding sequence of, any of 

CC the 275 nucleotide sequences, encoding the corresponding PRO polypeptide 

CC (one of 275 secreted or transmembrane proteins). The nucleic acid further

CC comprises the full-length coding sequence of the DNA deposited under 

CC American Type Culture Collection (ATCC) accession number in a list given 

CC in the specification. Also included are vectors and host cells for 

CC producing PRO proteins, PRO fusion proteins, anti-PRO antibodies, PRO 

CC extracellular domains and mature sequences, methods of detecting PRO 

CC proteins, methods for stimulating the release of TNF-alpha (tumour 

CC necrosis factor alpha) from human blood, (and the proliferation or

CC differentiation of chondrocyte cells, the proliferation of, or gene

CC expression in pericyte cells, the release of proteoglycans from

CC cartilage, proliferation of inner ear utricular supporting cells, the

CC proliferation of T-lymphocyte cells, the release of a cytokine from

CC peripheral blood mononuclear cells (PBMC), or the proliferation of

CC endothelial cells), a method for modulating the uptake of glucose or free 

CC fatty acid (FFA) by skeletal muscle cells, a method for inhibiting the 

CC binding of A-peptide to factor VIIA, or the differentiation of adipocyte 

CC cells, a method for detecting the presence of a tumour in a mammal and an 

CC oligonucleotide probe derived from any of the nucleotide sequences cited 

CC above. The nucleic acids and polypeptides are useful for treating

CC inflammatory diseases, organ failure, atherosclerosis, cardiac injury, 

CC infertility, birth defects, premature aging, AIDS (acquired 

CC immunodeficiency syndrome), cancer, or diabetic complications. The 

CC nucleic acids are useful as hybridisation probes, in chromosome and gene 

CC mapping, and in generating antisense RNA or DNA. The polypeptides are 

CC useful as pharmaceuticals, diagnostics, biosensors or bioreactors. Both 

CC are useful in tissue typing. The present sequence represents a PRO 

CC protein of the invention 

XX 

SQ Sequence 379 AA; 



Query Match 99.6%; Score 1708; DB 6; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Db  44 GDVT)DAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEiAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVXNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQ^ 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIvTCEKALIVTJWLSWAENQRRLKV^ 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVXKLLLNLAENPAMTRELL^ 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVTjKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVTLKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVTLKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVTjGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 13 
ADA45735 



ID ADA45735 standard; protein; 379 AA. 
XX 

AC ADA45735; 
XX 

DT 20-NOV-2003 (first entry) 
XX 

DE Novel human secreted and transmembrane protein PRO6007. 
XX 

KW Human; secreted and transmembrane protein; PRO; 

KW Tumour necrosis factor alpha release; TNF-alpha release; 

KW glucose uptake modulator; FFA uptake modulator; 

KW cell proliferation stimulator; cell differentiation stimulator; 

KW cell differentiation inhibitor; cytokine release stimulator; tumour;

KW lung tumour; colon tumour; breast tumour; prostate tumour; rectal tumour;

KW cervical tumour; liver tumour; chromosome mapping; gene mapping; 

KW gene therapy; chromosome identification; chromosome marker. 
XX 

OS Homo sapiens . 
XX 

PN US2003022328-A1. 
XX 

PD 30-JAN-2003. 
XX 

PF 16-APR-2002; 2002US-00123904 . 
XX 

PR 31-MAR-1997; 97WO-US005230 . 
XX

PA (GETH ) GENENTECH INC. 
XX 

PI Baker KP, Beresini M, Deforge L, Desnoyers L, Filvaroff E, Gao W; 

PI Gerritsen ME, Goddard A, Godowski PJ, Gurney AL, Sherwood S; 

PI Smith V, Stewart TA, Tumas D, Watanabe CK, Wood WI, Zhang Z; 
XX 

DR WPI; 2003-584997/55. 

DR N-PSDB; ADA45734. 
XX 

PT Novel secreted and transmembrane polypeptide for modulating biological 

PT activity of cell expressing the polypeptide, identifying agonists or 

PT antagonists of polypeptide, and as molecular weight markers. 
XX 

PS Claim 12; Fig 216; 659pp; English. 
XX 



CC The invention describes 305 nucleic acids encoding PRO (secreted and 

CC transmembrane) polypeptides (I). (I) is useful for stimulating the

CC release of TNF-alpha from human blood, for modulating the uptake of 

CC glucose or FFA by skeletal muscle cells or adipocyte cells, for 

CC stimulating the proliferation or differentiation of chondrocyte cells, 

CC for stimulating the proliferation of or gene expression in pericyte 

CC cells, for stimulating the release of proteoglycans from cartilage, for 

CC stimulating the proliferation of inner ear utricular supporting cells, 

CC for stimulating the proliferation of T-lymphocyte cells, for stimulating 

CC the release of a cytokine from PBMC cells, for inhibiting the binding of 

CC A-peptide to factor VIIA, for inhibiting the differentiation of adipocyte

CC cells, for stimulating proliferation of endothelial cells, for detecting 

CC the presence of tumour in a mammal. The tumour is lung, colon, breast, 

CC prostate, rectal, cervical or liver tumour. The oligonucleotide probes 

CC are useful for isolating genomic and cDNA nucleotide sequences or 

CC antisense probes. (I) is also useful as therapeutic agent. PRO is useful 

CC in assays to identify other proteins or molecules involved in binding 

CC interaction. A polynucleotide (II) encoding (I) is useful in chromosome 

CC and gene mapping, in generation of antisense RNA and DNA, in the 

CC preparation of PRO polypeptide, for generating transgenic animals or 

CC knockout animals which in turn are useful in the development and 

CC screening of therapeutically useful reagents, in gene therapy, for 

CC chromosome identification, as chromosome marker, and for generating 

CC probes. An anti-(I)-antibody is useful in diagnostic assays for PRO, e.g.

CC detecting its expression in specific cells, tissues or serum, and for 

CC affinity purification of PRO from recombinant cell culture or natural 

CC sources. (I) and (II) are useful for tissue typing. This is the amino 

CC acid sequence of a novel human secreted and transmembrane PRO 

CC polypeptide. 

XX 

SQ Sequence 379 AA; 



Query Match 99.6%; Score 1708; DB 6; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVTCEKALIVXNNLSWAENQRRLKWMNQVCDDTITSRLNSSVQ^ 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIV1<EKALIVXNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTWNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
Db 224 GLRLLTNMTWNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVTLKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337

RESULT 14
ADA76166 

ID ADA76166 standard; protein; 379 AA. 
XX 

AC ADA76166;
XX 

DT 20-NOV-2003 (first entry) 
XX 

DE Human PRO polypeptide #108. 
XX 

KW Human; PRO; secreted polypeptide; transmembrane polypeptide; 

KW tumour necrosis factor-alpha; TNF-alpha; chondrocyte cell; tumour; 

KW cancer; adrenal; lung; colon; breast; prostate; rectum; kidney; cervix;

KW liver; microvascular endothelial cell; glucose; FFA; 

KW skeletal muscle cell; adipocyte cell; pericyte cell; 

KW inner ear utricular supporting cell; T-lymphocyte cell; 

KW endothelial cell tube formation; bone disorder; cartilage disorder; 

KW sports injury; proteoglycan; articular cartilage defect; osteoarthritis; 

KW rheumatoid arthritis; haemoglobin-associated disorder thalassaemia; 

KW immune system cell infiltration. 

XX 

OS Homo sapiens. 
XX 

PN US2003073212-A1. 
XX 

PD 17-APR-2003. 
XX 

PF 16-APR-2002; 2002US-00123903.
XX 

PR 31-MAR-1997; 97WO-US005230.

PR 12-JUN-1998; 98WO-US012456.
XX 

PA (GETH ) GENENTECH INC. 
XX 

PI Baker KP, Beresini M, Deforge L, Desnoyers L, Filvaroff E, Gao W; 
PI Gerritsen ME, Goddard A, Godowski PJ, Gurney AL, Sherwood S; 
PI Smith V, Stewart TA, Tumas D, Watanabe CK, Wood WI, Zhang Z; 
XX 

DR WPI; 2003-687639/65. 
DR N-PSDB; ADA76165. 
XX 

PT New isolated nucleic acid encoding a secreted and transmembrane 

PT polypeptide, designated e.g. PRO1114 or PRO4978, useful in chromosome and

PT gene mapping, in generating antisense RNA and DNA, and in gene therapy. 

XX 

PS Claim 12; Fig 216; 659pp; English. 
XX 

CC The invention relates to isolated human PRO polypeptides (secreted and 

CC transmembrane polypeptides) and the polynucleotides encoding them. The 

CC invention also relates to an antibody which specifically binds to a PRO 

CC polypeptide, a method for stimulating the release of tumour necrosis 

CC factor-alpha (TNF-alpha) from human blood, a method for stimulating the 

CC proliferation or differentiation of chondrocyte cells and a method for 

CC detecting the presence of a tumour in a mammal (e.g. adrenal, lung, 

CC colon, breast, prostate, rectal, kidney, cervical and liver tumours). The

CC polynucleotides are useful in molecular biology, including uses as 

CC hybridisation probes, in chromosome and gene mapping, in generating 

CC antisense RNA and DNA and in gene therapy. The polynucleotides may also 

CC be used in preparing PRO polypeptides by recombinant techniques and in

CC generating either transgenic animals or knock-out animals which are 

CC useful in the development and screening of therapeutically useful 

CC reagents. The PRO polypeptides or antibodies are used in preparing a

CC medicament for treating a condition responsive to the polypeptides or 

CC antibodies, such as tumours, for stimulating and inhibiting proliferation 

CC of human microvascular endothelial cells, for modulating the uptake of 

CC glucose or FFA by skeletal muscle cells or adipocyte cells, for 

CC stimulating differentiation of adipocyte cells, for stimulating 

CC proliferation of or gene expression in pericyte cells, for stimulating 

CC the proliferation of inner ear utricular supporting cells or T-lymphocyte 

CC cells, for inducing endothelial cell tube formation and for treating 



CC various bone and/or cartilage disorders such as sports injuries and 

CC arthritis. PRO polypeptides which stimulate the release of proteoglycans 

CC from cartilage are useful for treating sports-related joint problems, 

CC articular cartilage defects, osteoarthritis and rheumatoid arthritis. PRO 

CC polypeptides are also useful for treating various mammalian haemoglobin- 

CC associated disorders such as various thalassaemias and conditions which 

CC may benefit from enhanced local immune system cell infiltration. This 

CC sequence represents a human PRO polypeptide of the invention. Note: The 

CC sequence data for this patent is also available in electronic format from 

CC USPTO at seqdata.uspto.gov/sequence.html. 

XX 

SQ Sequence 379 AA; 



Query Match 99.6%; Score 1708; DB 6; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy    2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy   62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy  122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy  182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy  242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy  302

RESULT 15
ADA18816 

ID ADA18816 standard; protein; 379 AA. 
XX 

AC ADA18816; 
XX 

DT 20-NOV-2003 (first entry) 
XX 

DE Human PRO polypeptide #108. 
XX 

KW Human; PRO; secreted polypeptide; transmembrane polypeptide; 

KW tumour necrosis factor-alpha; TNF-alpha; blood; chondrocyte cell; lung; 

KW colon; breast; prostate; rectum; cervix; liver; tumour; cancer;
XX

PA (GETH ) GENENTECH INC. 
XX 

PI Baker KP, Beresini M, Deforge L, Desnoyers L, Filvaroff E, Gao W; 
PI Gerritsen ME, Goddard A, Godowski PJ, Gurney AL, Sherwood S; 



PI Smith V, Stewart TA, Tumas D, Watanabe CK, Wood WI, Zhang Z;
XX 

DR WPI; 2003-521854/49. 

DR N-PSDB; ADA18815. 
XX 

PT New PRO nucleic acid, useful for preparing a composition for treating 

PT e.g., tumors. 

XX 

PS Claim 12; Fig 216; 660pp; English. 
XX 

CC The invention relates to isolated human PRO polypeptides (secreted and 

CC transmembrane polypeptides) and the polynucleotides encoding them. The 

CC invention also relates to an antibody which specifically binds to a PRO 

CC polypeptide, a method for stimulating the release of tumour necrosis 

CC factor-alpha (TNF-alpha) from human blood, a method for stimulating the 

CC proliferation or differentiation of chondrocyte cells and a method for 

CC detecting the presence of a tumour in a mammal (e.g. lung, colon, breast,

CC prostate, rectal, cervical and liver tumours). The polynucleotides are

CC useful in molecular biology, including uses as hybridisation probes, in 

CC chromosome and gene mapping, in generating antisense RNA and DNA and in 

CC gene therapy. The polynucleotides may also be used in preparing PRO 

CC polypeptides by recombinant techniques and in generating either 

CC transgenic animals or knock-out animals which are useful in the 

CC development and screening of therapeutically useful reagents. The PRO 

CC polypeptides or antibodies are used in preparing a medicament for 

CC treating a condition responsive to the polypeptides or antibodies, such 

CC as tumours, for modulating the uptake of glucose or FFA by adipocyte 

CC cells, for stimulating the proliferation of or gene expression in 

CC pericyte cells, for stimulating the release of proteoglycans from 

CC cartilage, for stimulating the proliferation of inner ear utricular 

CC supporting cells, for stimulating the release of cytokines from PBMC 

CC cells, for inhibiting the binding of A-peptide to factor VIIA, for 

CC inhibiting the differentiation of adipocyte cells and for stimulating the 

CC proliferation of endothelial cells. This sequence represents a human PRO 

CC polypeptide of the invention. Note: The sequence data for this patent is 

CC also available in electronic format from USPTO at 

CC seqdata.uspto.gov/sequence.html.

XX 

SQ Sequence 379 AA; 

Query Match 99.6%; Score 1708; DB 6; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-162; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy    2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy   62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy  122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy  182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy  242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy  302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
        ||| ||||||||||||||||||||||||||||||||
Db  344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



Search completed: January 7, 2005, 14:48:37 
Job time : 77.8455 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: January 7, 2005, 13:52:30 ; Search time 22.1064 Seconds
(without alignments)
1010.981 Million cell updates/sec

Title: US-10-726-721A-9
Perfect score: 1715
Sequence: 1 RGDVDDAGDCSGARYNDWSD VKVGKFMAKLAEHMFPKSQE 337

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5



Searched: 478139 seqs, 66318000 residues 

Total number of hits satisfying chosen parameters: 478139 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 
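
The throughput figure in this header appears to follow from the other header values: query length times residues searched, divided by the search time. A minimal Python sketch of that arithmetic, assuming one "cell update" per query-residue/database-residue pair (the function name is hypothetical and not part of GenCore):

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int, seconds: float) -> float:
    """Dynamic-programming cells filled per second, in millions.

    Assumption: one cell update per (query residue, database residue) pair.
    """
    return query_len * db_residues / seconds / 1e6


# Values copied from this header: a 337-residue query,
# 66318000 residues searched, 22.1064 seconds of search time.
rate = million_cell_updates_per_sec(337, 66_318_000, 22.1064)
print(f"{rate:.3f} million cell updates/sec")
```

Under that assumption the computed rate agrees with the reported 1010.981 to within the rounding of the search time.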

Database : Issued_Patents_AA: * 

1: /cgn2_6/ptodata/1/iaa/5A_COMB.pep:*

2: /cgn2_6/ptodata/1/iaa/5B_COMB.pep:*

3: /cgn2_6/ptodata/1/iaa/6A_COMB.pep:*

4: /cgn2_6/ptodata/1/iaa/6B_COMB.pep:*

5: /cgn2_6/ptodata/1/iaa/PCTUS_COMB.pep:*

6: /cgn2_6/ptodata/1/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
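
The header names a Smith-Waterman ("sw") model with a gap-open penalty of 10.0 and a gap-extend penalty of 0.5. The sketch below shows how a local alignment score with affine gaps can be computed (Gotoh's recurrence). It is an illustration only, not GenCore's implementation: `toy_score` is a hypothetical stand-in for BLOSUM62, and the convention that a gap of length k costs gap_open + (k - 1) * gap_extend is an assumption that may differ from the tool's.

```python
def toy_score(x: str, y: str) -> float:
    """Hypothetical substitution score standing in for BLOSUM62."""
    return 5.0 if x == y else -4.0


def sw_affine_score(a: str, b: str, score=toy_score,
                    gap_open: float = 10.0, gap_extend: float = 0.5) -> float:
    """Best local alignment score of a and b with affine gap penalties.

    Assumption: the first gap residue costs gap_open, each further one gap_extend.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]  # best score ending at (i, j)
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # ending with a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # ending with a gap in b
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diag = H[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            H[i][j] = max(0.0, diag, E[i][j], F[i][j])  # 0.0 restarts the alignment
            best = max(best, H[i][j])
    return best


print(sw_affine_score("ACDEFGHI", "ACDEWFGHI"))
```

With a real substitution matrix the same recurrence applies; only `score` changes.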
ALIGNMENTS



RESULT 1 

US-09-780-996A-9 

; Sequence 9, Application US/09780996A 

; Patent No. 6696273 

; GENERAL INFORMATION: 

; APPLICANT: Maury, Isabella 

; APPLICANT: Mercken, Luc 

; APPLICANT: Fournier, Alain 

; TITLE OF INVENTION: Partners of the PTB1 Domain of FE65, Preparation and Uses 

; FILE REFERENCE: ST00004-US 

; CURRENT APPLICATION NUMBER: US/09/780,996A

; CURRENT FILING DATE: 2001-02-09 

; PRIOR APPLICATION NUMBER: FR00/01628 

; PRIOR FILING DATE: 2000-02-10 

; PRIOR APPLICATION NUMBER: US 60/198,500 

; PRIOR FILING DATE: 2000-04-18 

; NUMBER OF SEQ ID NOS : 11 

; SOFTWARE: PatentIn version 3.2



; SEQ ID NO 9 
; LENGTH: 337 
; TYPE: PRT 

; ORGANISM: Homo sapiens 
; FEATURE:
; NAME/KEY: misc_feature
; OTHER INFORMATION: X=G, D, V, or A
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (305)..(305)
; OTHER INFORMATION: Xaa can be any naturally occurring amino acid 
US-09-780-996A-9 

Query Match 99.9%; Score 1713; DB 4; Length 337; 

Best Local Similarity 100.0%; Pred. No. 2e-175; 

Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 
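
The two percentages in each result block appear to be simple ratios of the other printed values: Query Match is the score over the perfect score, and Best Local Similarity is the match count over the total number of alignment columns. This reading is inferred from the numbers in this listing rather than from GenCore documentation, and the function names below are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect score (assumed definition)."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all alignment columns (assumed definition)."""
    aligned = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned


# Values from this result: Score 1713, perfect score 1715, 337 matches.
print(round(query_match_pct(1713, 1715), 1))
print(round(best_local_similarity_pct(337, 0, 0, 0), 1))
```

The same two functions can be applied to the statistics of any other result in the listing as a consistency check on OCR-damaged figures.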



Qy    1 RGDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRAR 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 RGDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRAR 60

Qy   61 RAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDL 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 RAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDL 120

Qy  121 GGLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQL 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 GGLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQL 180

Qy  181 AGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLR 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 AGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLR 240

Qy  241 AQVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQV 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 AQVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQV 300

Qy  301 CADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
        |||||||||||||||||||||||||||||||||||||
Db  301 CADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337



RESULT 2 
US-09-193-266-1 

; Sequence 1, Application US/09193266 

; Patent No. 6214582 

; GENERAL INFORMATION: 

; APPLICANT: Marcu, Kenneth B. 

; TITLE OF INVENTION: Y2H35 A Strong IKK Binding Protein 

; FILE REFERENCE: 178-257 

; CURRENT APPLICATION NUMBER: US/09/193,266 

; CURRENT FILING DATE: 1998-11-16 

; NUMBER OF SEQ ID NOS : 3 

; SOFTWARE: PatentIn Ver. 2.0

; SEQ ID NO 1 

; LENGTH: 379 

; TYPE: PRT 



; ORGANISM: Homo sapiens 
US-09-193-266-1 



Query Match 99.6%; Score 1708; DB 3; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-175; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 
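
Each result carries the same three-line statistics block, so the values can be pulled out mechanically. The following is a minimal Python sketch, assuming the "label value;" layout seen in this listing; `parse_stats` is a hypothetical helper, and it does not attempt to repair OCR damage such as a letter "l" printed in place of a digit.

```python
import re

# One pattern for every "label value" pair printed in a result's statistics block.
STAT = re.compile(
    r"(Query Match|Best Local Similarity|Pred\. No\.|Score|DB|Length|"
    r"Matches|Conservative|Mismatches|Indels|Gaps)\s+([0-9][0-9.eE+-]*)"
)


def parse_stats(block: str) -> dict:
    """Map each statistic label to its numeric value (percent signs are dropped)."""
    return {label: float(value) for label, value in STAT.findall(block)}


example = """\
Query Match 99.6%; Score 1708; DB 3; Length 379;
Best Local Similarity 99.7%; Pred. No. 8.3e-175;
Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0;
"""
stats = parse_stats(example)
print(stats["Score"], stats["Pred. No."])
```

Parsed values make it easy to sort hits by score or to filter on Pred. No. without reading the alignments.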



Qy    2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy   62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy  122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy  182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy  242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy  302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
        ||| ||||||||||||||||||||||||||||||||
Db  344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 3 

US-10-140-002-216 

; Sequence 216, Application US/10140002
; Patent No. 6725730
; GENERAL INFORMATION:
; APPLICANT: Baker, Kevin P.
; APPLICANT: Beresini, Maureen
; APPLICANT: DeForge, Laura
; APPLICANT: Desnoyers, Luc
; APPLICANT: Filvaroff, Ellen
; APPLICANT: Gao, Wei-Qiang
; APPLICANT: Gerritsen, Mary E.
; APPLICANT: Goddard, Audrey
; APPLICANT: Godowski, Paul J.
; APPLICANT: Gurney, Austin L.
; APPLICANT: Sherwood, Steven
; APPLICANT: Smith, Victoria
; APPLICANT: Stewart, Timothy A.
; APPLICANT: Tumas, Daniel
; APPLICANT: Watanabe, Colin K
; APPLICANT: Wood, William
; APPLICANT: Zhang, Zemin
; TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC
; TITLE OF INVENTION: ACIDS ENCODING THE SAME
; FILE REFERENCE: P3330R1C59



; CURRENT APPLICATION NUMBER: US/10/140,002 
; CURRENT FILING DATE: 2002-05-06 

; Prior Application removed - See Palm or File Wrapper 
; NUMBER OF SEQ ID NOS : 550 
; SEQ ID NO 216 
; LENGTH: 379 

; TYPE: PRT
; ORGANISM: Homo Sapien 
US-10-140-002-216 

Query Match 99.6%; Score 1708; DB 4; Length 379; 

Best Local Similarity 99.7%; Pred. No. 8.3e-175; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy    2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy   62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy  122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy  182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy  242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy  302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
        ||| ||||||||||||||||||||||||||||||||
Db  344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 4 

US-10-000-489-30 

; Sequence 30, Application US/10000489 

; Patent No. 6794363 

; GENERAL INFORMATION: 

; APPLICANT: Benjanin, Stephane 

; APPLICANT: Tanaka, Hiroaki 

; TITLE OF INVENTION: HUMAN CDNAS AND PROTEINS AND USES THEREOF 

; FILE REFERENCE: 91.US6.DIV 

; CURRENT APPLICATION NUMBER: US/10/000,489 

; CURRENT FILING DATE: 2001-11-14 

; PRIOR APPLICATION NUMBER: US 09/924,340 

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: PCT/IB01/01715

; PRIOR FILING DATE: 2001-08-06 

; PRIOR APPLICATION NUMBER: US 60/305,456 

; PRIOR FILING DATE: 2001-07-13 

; PRIOR APPLICATION NUMBER: US 60/302,277 



; PRIOR FILING DATE: 2001-06-29 

; PRIOR APPLICATION NUMBER: US 60/298,698 

; PRIOR FILING DATE: 2001-06-15 

; PRIOR APPLICATION NUMBER: US 60/293,574 

; PRIOR FILING DATE: 2001-05-25 

; NUMBER OF SEQ ID NOS : 112 

; SOFTWARE: JPatent 

; SEQ ID NO 30 

; LENGTH: 258 

; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: SIGNAL
; LOCATION: 1..20
; NAME/KEY: UNSURE
; LOCATION: 49
; OTHER INFORMATION: Xaa = Glu, *
US-10-000-489-30 

Query Match 11.9%; Score 203.5; DB 4; Length 258; 

Best Local Similarity 46.5%; Pred. No. 2e-13; 

Matches 47; Conservative 19; Mismatches 22; Indels 13; Gaps 3;

Qy 39 RIGTEAGTRA RARARARATRA RRAVQKRAS PNS DDTVLS PQELQKVLCL 87 

I I : I I I I : : | | : : : M | | | | | I : I I : | | | | | : 

Db 156 RSGSRAGGRASGKSKGKARSKSTRAPATTWPVRRG — KFNFPYKIDDILSAPDLQKVLNI 213 

Qy 88 VEMS EKP YI LEAALI ALGNNAAYAFNRDI I RDLGGLPI VAK 128 

: I : 1:1 I I I : I I I I I I I : I I : : I I : I I I : I I : I I 
Db 214 LERTNDPFIQEVALVTLGNNAAYSFNQNAIRELGGVPIIAK 254 



RESULT 5 

US-09-513-999C-7259 

; Sequence 7259, Application US/09513999C 
; Patent No. 6783961 
; GENERAL INFORMATION: 

; APPLICANT: Dumas Milne Edwards, J.B. 

; APPLICANT: Duclert, A. 

; APPLICANT: Giordano, J.Y. 

; TITLE OF INVENTION: Expressed Sequence Tags and Encoded Human Proteins. 

; Patent No. 6783961 

; FILE REFERENCE: 59.US2.REG 

; CURRENT APPLICATION NUMBER: US/09/513,999C

; CURRENT FILING DATE: 2000-02-24 

; PRIOR APPLICATION NUMBER: US 60/122,487 

; PRIOR FILING DATE: 1999-02-26 

; NUMBER OF SEQ ID NOS: 36681 

; SOFTWARE: Patent.pm

; SEQ ID NO 7259 

; LENGTH: 59 

TYPE: PRT 
; ORGANISM: Homo sapiens 
US-09-513-999C-7259 



Query Match 10.3%; Score 176; DB 4; Length 59; 

Best Local Similarity 58.9%; Pred. No. 1.8e-11;



Matches 33; Conservative 12; Mismatches 11; Indels 0; Gaps 0; 



Qy  222 LKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWE 277
        :||::|  |||||||||:  :||| | |||||: ::|::| :| :||||||| | |
Db    1 MKLIINFTENPAMTRELVSCKVPSELISLFNKEWDREILLNILTLFENINDNIKNE 56



RESULT 6 

US-09-248-796A-20705 

; Sequence 20705, Application US/09248796A
; Patent No. 6747137
; GENERAL INFORMATION:
; APPLICANT: Keith Weinstock et al
; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO CANDIDA ALBICANS
; TITLE OF INVENTION: FOR DIAGNOSTICS AND THERAPEUTICS
; FILE REFERENCE: 107196.132
; CURRENT APPLICATION NUMBER: US/09/248,796A
; CURRENT FILING DATE: 1999-02-12
; PRIOR APPLICATION NUMBER: US 60/074,725
; PRIOR FILING DATE: 1998-02-13
; PRIOR APPLICATION NUMBER: US 60/096,409
; PRIOR FILING DATE: 1998-08-13
; NUMBER OF SEQ ID NOS: 28208
; SEQ ID NO 20705
; LENGTH: 536
; TYPE: PRT
; ORGANISM: Candida albicans
; FEATURE:
; NAME/KEY: UNSURE
; LOCATION: (123)
; OTHER INFORMATION: Identity of amino acid sequences at the above locations are unknown.
US-09-248-796A-20705 

Query Match 6.4%; Score 109; DB 4; Length 536; 

Best Local Similarity 25.0%; Pred. No. 0.0088; 

Matches 65; Conservative 35; Mismatches 84; Indels 76; Gaps 14; 

Qy 116 IIRDLGGL-PIVAKILNTRDPIVKEKALIVl^NNLSWAEN-QRRLKVYMNQVCDDT — IT 171 

M I : I : I I : I I I I I M I I I I I I I I M M : : I : I I 
Db 243 LI NDVEGI S EI VLECLNDRDLI I KRKALEVSN YL- VNEDNI TEWKIMLMQLVPDNNMI D 301 

Qy 172 SRLNSSVQLAGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNL 228 

I : I I : : : I : I I I : I III : M I 

Db 302 DMLKLEITLKI LQI ASQNNYVN IPN FRWYVA VLKDVTNLTLL 343 

Qy 229 AENPAMTRELLRAQVPSSLGSLFNKKENK — EVILKLL — VI FEN I ND 272 

II: : :: : :| I | | II |: I : I 

Db 344 PVEGATNSGLIASHIANEISTEVGKEFKNLATKVPSVKSYLLQNVVLELVQDVRLLDSSA 403 

Qy 273 NFKW EENEPTQNQFGEG SLFFFLKEFQV 300 

: I I : I I I I : : I I : I : 

Db 404 LILKDLYWILGEYISELKVTQNDDGDDDSDSDSDGEDAQVKVXDIDKKIKVFNTLINYQI 463 



Qy 301 CADKXLGIESHHDFLVKVKV 320 

I I I I : I : II I : 



Db 



464 — DKKLGLTSNIHFPVSAKL 481 



RESULT 7 

US-09-270-767-42679 

; Sequence 42679, Application US/09270767 

; Patent No. 6703491 

; GENERAL INFORMATION: 

; APPLICANT: Homburger et al.

; TITLE OF INVENTION: Nucleic acids and proteins of Drosophila melanogaster 

; FILE REFERENCE: File Reference: 7326-094 

; CURRENT APPLICATION NUMBER: US/09/270,767 

; CURRENT FILING DATE: 1999-03-17 

; NUMBER OF SEQ ID NOS : 62517 

; SOFTWARE: PatentIn Ver. 2.0

; SEQ ID NO 42679

LENGTH: 561 

TYPE: PRT 

; ORGANISM: Drosophila melanogaster 
US-09-270-767-42679 

Query Match 5.9%; Score 100.5; DB 4; Length 561; 

Best Local Similarity 22.1%; Pred. No. 0.077; 

Matches 46; Conservative 29; Mismatches 94; Indels 39; Gaps 4; 

Qy 78 PQELQKVLCLVEMSEKPYI LEAALIALGNNAAYAFN RDI I RDLGGLPI VAKI LNT 132 

I : : I I I:: I I I I I I : I : P . : I : I I I I I : : : I 

Db 301 PEWQYYLSLLQSCSNPETLEAAAGAIQNLSACYWQPSIDIRATVRKEKGLPILVELLRM 360 

Qy 133 RDPIVTCEKALIVXNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQ 192 

I | | |::: |: : | :| I : 

Db 361 EVDRWCAVATALRNLAIDQRNKELIGKY AMRDLVQKLPS 400 

Qy 193 NEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFN 252 

II I : : | | | : | | : | | | : : I I 

Db 401 GNVQH DQNT S DDT I T AVLAT I NEVI K KNPEFSRSLLDS GGIDRLMN 446 



Qy 253 KKENKEVTLKLLVT FENINDNFKWEENE 280 

: I I : : I : I : II 

Db 447 ITKRKEKYTSCVLKFASQVLYTMWQHNE 474 



RESULT 8 

US-09-270-767-46165 

; Sequence 46165, Application US/09270767 

; Patent No. 6703491 

; GENERAL INFORMATION: 

; APPLICANT: Homburger et al.

; TITLE OF INVENTION: Nucleic acids and proteins of Drosophila melanogaster 

; FILE REFERENCE: File Reference: 7326-094 

; CURRENT APPLICATION NUMBER: US/09/270,767 

; CURRENT FILING DATE: 1999-03-17 

; NUMBER OF SEQ ID NOS: 62517 

; SOFTWARE: PatentIn Ver. 2.0

; SEQ ID NO 46165 

LENGTH: 672 

TYPE: PRT 



; ORGANISM: Drosophila melanogaster 
US-09-270-767-46165 



Query Match 5.9%; Score 100.5; DB 4; Length 672; 

Best Local Similarity 22.1%; Pred. No. 0.1; 

Matches 46; Conservative 29; Mismatches 94; Indels 39; Gaps 4; 

Qy 78 PQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFN RDIIRDLGGLPIVAKILNT 132 

I : : I II:: I I I I I I : I : I : I : I I I I I : : : I 

Db 326 PEWQYYLSLLQSCSNPETLEAAAGAIQNLSACYWQPSIDIRATWKEKGLPILVELLRM 385 

Qy 133 RDPIVT<EKALIVLNNLSWAENQRRLKWMNQVCDD 192 

I | ||::: |: : | :| I : 

Db 386 EVD RWCAVAT AL RN LAI DQRNKELIGKY AMRDLVQKLPS 425 

Qy 193 NEYQHMLANSISDFFRLFSAGNEETKLQVXKLLLNLAENPAMTRELLRAQVPSSLGSLFN 252 

II I : : I I I : I I : I I I : : I I 

Db 426 GNVQHDQNTSDDTITAVLATINEVTK KNPEFSRSLLDS GGIDRLMN 471 

Qy 253 KKENKEVTLKLLVTFENINDNFKWEENE 280 

: I I : : I : I : II 

Db 472 ITKRKEKYTSCVLKFASQVLYTMWQHNE 499 



RESULT 9 

US-08-290-731C-14 

; Sequence 14, Application US/08290731C
; Patent No. 5843646 
; GENERAL INFORMATION: 

; APPLICANT: BOWTELL, David Douglas Lawrence 

TITLE OF INVENTION: DNA MOLECULES ENCODING MURINE 
TITLE OF INVENTION: SON OF SEVENLESS (mSOS) GENE, 
TITLE OF INVENTION: AND mSOS POLYPEPTIDES 
; NUMBER OF SEQUENCES: 15 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: SUGHRUE, MION, ZINN, MACPEAK & SEAS 
STREET: 2100 PENNSYLVANIA AVENUE, N.W. 
CITY: WASHINGTON 
; STATE: D.C. 

COUNTRY: USA 
ZIP: 20037
COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 

; COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/290,731C
FILING DATE: 17-OCT-1994 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: PCT/AU93/00068 
; FILING DATE: 17-FEB-1993 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: PL0921/92 
FILING DATE: 17-FEB-1992 
ATTORNEY/AGENT INFORMATION: 



NAME: KIT, Gordon 
REGISTRATION NUMBER: 30,764 
REFERENCE/DOCKET NUMBER: Q-36066
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (202) 293-7060 
; TELEFAX: (202) 293-7860 

TELEX: 6491103 
; INFORMATION FOR SEQ ID NO: 14: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 402 amino acids 
; TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-290-731C-14 



Query Match 5.8%; Score 100; DB 2; Length 402; 

Best Local Similarity 22.9%; Pred. No. 0.052; 

Matches 62; Conservative 36; Mismatches 77; Indels 96; Gaps 14; 

Qy 70 NSDDTV LSPQELQKVLCLVEMSEKP YI LEAALIALGNNAAYAFNRDI I RDL 120 

III: I I : I : I I I : : I : I 
Db 155 NSPDPIIYKDELVLLLPPREIAKQLCILEFQSFSHI 190 

Qy 121 GGLPIVAKI LNTRDPIVKEK-ALIVLNNLSVN AENQRRLKV — YMNQV 165 

: : I I II I III: hill I : I I I I I I 

Db 191 SRIQFLTKIWDELNRFSP — KEKTSTFYLSNHLVNFVTETIVQEEEPRRRTNVLAYFIQV 248 

Qy 166 CD DTITSRLNS SVQLAGLRLLTNMTVT NEYQ 196 

II : I I I I I I II I I I : I : I : 

Db 249 CD.YLRELNNFASLFSIISALNSSPIHRLRKTWANLNSKTLASFELLNNLTEARKNFSNYR 308 



Qy 197 HMLANSI SDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPS 245 

II: : I II:: : : : I : : I I I : 

Db 309 DCLENCVLPCVPFLGVYFTD-LTFLKTGNKDN FQNMINFDKRTKVTRILNEIKKFQ 363 

Qy 246 SLGSLFNK-KENKEVILKLLVIFENINDNFK 275

1:1 :ll I :|:: ::: I |: :: 
Db 364 SVGYMFNP INEVQELLNEVT S RERNTNNI YQ 394 



RESULT 10 

US-09-248-796A-16001 

; Sequence 16001, Application US/09248796A 

; Patent No. 6747137 

; GENERAL INFORMATION: 

; APPLICANT: Keith Weinstock et al 

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO CANDIDA ALBICANS

; TITLE OF INVENTION: FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: 107196.132 

; CURRENT APPLICATION NUMBER: US/09/248,796A

; CURRENT FILING DATE: 1999-02-12 

; PRIOR APPLICATION NUMBER: US 60/074,725 

; PRIOR FILING DATE: 1998-02-13 

; PRIOR APPLICATION NUMBER: US 60/096,409 

; PRIOR FILING DATE: 1998-08-13 

; NUMBER OF SEQ ID NOS : 28208 



; SEQ ID NO 16001 

LENGTH: 704 

TYPE: PRT 
; ORGANISM: Candida albicans 
US-09-248-796A-16001 

Query Match 5.8%; Score 100; DB 4; Length 704; 

Best Local Similarity 21.1%; Pred. No. 0.12; 

Matches 52; Conservative 55; Mismatches 96; Indels 44; Gaps 13; 

Qy 74 TVLSPQELQKVXCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRD LGGLPIVAK 128 

I I I I : I I III I : : I I : : I : I : : : I I : I : : 

Db 239 TVLS-EEAQ — LCL — KPEPGLLIRAMAISIDN — PESFDWWRGFFDLMLSHIPLDSD 291 

Qy 129 ILNTR-DPIVKEKALIVLNNLSVNAEN — QRRLKVYMNQVCDDTITSRLNSSVQLAGLRL 185 

: : I I : I : : : : : : : I I I I I : I : 

Db 292 VITNRITPTDREVLIMACSKITLRKDMSLNRRLWTYF LGPETEHESLKA 340 

Qy 186 LTNMTVTNEY — QHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQV 243 

II III:: I : : : II I : I I I : : I :: 

Db 341 LTR TEYFKQYVEETLINGLLAMAHSDKIELKCDAFKILLPLIMDKWEIGNVLTPKL 396 

Qy 244 PSS-LGSLFNKKENKEVILKLLVTFENINDNFKW EENEPTQNQFGEGSLFF 293 

I I I :| :::::::: :|: : : | :):: :::| : | 

Db 397 FSSFLKIAYNNRDHQDLMI SASTLFDGVESIYIWSDIIGVTLSDESDEEEHEF — DWHF 454 

Qy 294 FLKEFQV 300 

I I : I I 

Db 455 VLKDFNV 461



RESULT 11 
US-09-356-952-6 

; Sequence 6, Application US/09356952 

; Patent No. 6117663 

; GENERAL INFORMATION:

; APPLICANT: Boriack-Sjodin, Ann 

; APPLICANT: Margarit, S. M. 

; APPLICANT: Bor-Sogi, Dafna 

; APPLICANT: Cole, Philip 

; APPLICANT: Kuriyan, John 

; TITLE OF INVENTION: A CRYSTAL OF A RAS-SOS COMPLEX AND METHODS OF USE 
; TITLE OF INVENTION: THEREOF 
; FILE REFERENCE: 600-1-228N 

; CURRENT APPLICATION NUMBER: US/09/356,952 

; CURRENT FILING DATE: 1999-07-19 

; EARLIER APPLICATION NUMBER: 60/093,631 

; EARLIER FILING DATE: 1998-07-21 

; NUMBER OF SEQ ID NOS : 14 

; SOFTWARE: PatentIn Ver. 2.0

; SEQ ID NO 6 

LENGTH: 911 
; TYPE: PRT 

; ORGANISM: Schizosaccharomyces pombe 
US-09-356-952-6 



Query Match 5.8%; Score 100; DB 3; Length 911;

Best Local Similarity 22.9%; Pred. No. 0.19;

Matches 62; Conservative 36; Mismatches 77; Indels 96; Gaps 14; 



Qy 70 NSDDTV LSPQELQKVXCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDL 120 

III: I I : I : I I I : : I : I 
Db 647 NSPDPIIYKDELVLLLPPREIAKQLCILEFQSFSHI 682 

Qy 121 GGLPIVAKI LNTRDPIVKEK-ALIVLNNLSVN AENQRRLKV — YMNQV 165 

: : I I II I III: hi II I : I I I I I I 

Db 683 SRIQFLTKIWDNLNRFSP — KEKTSTFYLSNHLVNEVTETIVQEEEPRRRTNVLAYFIQV 740 

Qy 166 CD DTITSRLNS SVQLAGLRLLTNMTVT NEYQ 196 

II : I I I I I III II I : I : I : 

Db 741 CDYLRELNNFASLFSIISALNSSPIHRLRKTWANLNSKTLASFELLNNLTEARKNFSNYR 800 

Qy 197 HMLANSI SDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPS 245 

II: :| II:: : ::| : :|| I : 

Db 801 DCLENCVXPCVPFLGVYFTD-LTFLKTGNKDN FQNMINFDKRTKVTRILNEIKKFQ 855 

Qy 246 SLGSLFNK-KENKEVTLKLLVIFENINDNFK 275 

1:1 :|| I :|:: ::: I |: :: 
Db 856 SVGYMFNPINEVQELLNEVTSRERNTNNIYQ 886



RESULT 12 

US-09-134-001C-3159 

; Sequence 3159, Application US/09134001C 
; Patent No. 6380370 
; GENERAL INFORMATION: 

; APPLICANT: Lynn Doucette-Stamm et al 

; TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES RELATING TO STAPHYLOCOCCUS

; TITLE OF INVENTION: EPIDERMIDIS FOR DIAGNOSTICS AND THERAPEUTICS 
; FILE REFERENCE: GTC-007 

; CURRENT APPLICATION NUMBER: US/09/134,001C

; CURRENT FILING DATE: 1998-08-13 

; PRIOR APPLICATION NUMBER: US 60/064,964

; PRIOR FILING DATE: 1997-11-08 

; PRIOR APPLICATION NUMBER: US 60/055,779 

; PRIOR FILING DATE: 1997-08-14 

; NUMBER OF SEQ ID NOS: 5674 

; SEQ ID NO 3159 

LENGTH: 10182 

TYPE: PRT 

; ORGANISM: Staphylococcus epidermidis 
US-09-134-001C-3159 

Query Match 5.8%; Score 99.5; DB 3; Length 10182; 

Best Local Similarity 21.7%; Pred. No. 8.8; 

Matches 53; Conservative 54; Mismatches 104; Indels 33; Gaps 9; 
Qy 58 RARRAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFN 113

Db 7876 RVKQIINQTSNP TMNPLEVERATSNVKISKDALHGERELNDNKNSKTFAVNHLDN 7930

Qy 114 RDIIRDLGGLPIVAKILNTRDPIVKEKAL IVLNNLSVNAENQRRLKVYMN 163

: : : I I : : : I : I I I I I :: :| I: |:|

Db 7931 LNQAQKEALTHEIEQATIVSQVNNIYN KAKALNNDMKKLKDIVAQQDNVRQSNNYIN 7987

Qy 164 QVCDDTITSRLNSSVQLAG--LRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQV 221

: I I : I : : I : : I I : : : : : I : I : : : I : I I I I 

Db 7988 E — DSTPQNMYNDTINHAQSIIDQVANPTMSHD EIENAINNIKHAINALDGEHKLQQ 8042 

Qy 222 LKLLLNLAENPAMTRELLRAQVTSSLGSLFNKKENKEVILKLLVTFENINDNFKWEENEP 281 

I III : II :: | |: : :| : : I : :|| I I 

Db 8043 AKENANLLIN S LN DLN APQ RDAI N RLVN EAQT RE KVAEQLQ S AQALN DAMKH LRN S - 8098 

Qy 282 TQNQ 285 

III 

Db 8099 IQNQ 8102 



RESULT 13 

US-09-710-279-2964 

; Sequence 2964, Application US/09710279
; Patent No. 6703492
; GENERAL INFORMATION:
; APPLICANT: KIMMERLY, WILLIAM JOHN
; TITLE OF INVENTION: STAPHYLOCOCCUS EPIDERMIDIS NUCLEIC ACIDS AND PROTEINS
; FILE REFERENCE: PU3480US
; CURRENT APPLICATION NUMBER: US/09/710,279
; CURRENT FILING DATE: 2000-11-09
; PRIOR APPLICATION NUMBER: 60/164,258
; PRIOR FILING DATE: 1999-11-09
; NUMBER OF SEQ ID NOS: 4472
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2964
; LENGTH: 5024
; TYPE: PRT
; ORGANISM: Artificial Sequence
; FEATURE:
; OTHER INFORMATION: Description of Artificial Sequence: synthetic
; OTHER INFORMATION: amino acid sequence
; FEATURE:
; NAME/KEY: MOD_RES
; LOCATION: (5024)
; OTHER INFORMATION: variable amino acid
US-09-710-279-2964

Query Match 5.7%; Score 97.5; DB 4; Length 5024;
Best Local Similarity 21.7%; Pred. No. 4.8;
Matches 53; Conservative 53; Mismatches 105; Indels 33; Gaps 9;



Qy 58 RARRAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFN 113

I :::::: I : : I I : : : I : I : II I : : I I

Db 2930 RVKQIINQTSNP TMNPLEVERATSNVKTSKDALHGERELNDNKNSKTFAVNHLDN 2984

Qy 114 RDIIRDLGGLPIVAKILNTRDPIVKEKAL IVLNNLSVNAENQRRLKVYMN 163

: : : ' I I : : : I : I II I I : : : I I : I : I

Db 2985 LNQAQKEALTHEIEQATIVSQVNNIYN KAKALNNDMKKLKDIVAQQDNVRQSNNYIN 3041

Qy 164 QVCDDTITSRLNSSVQLAG--LRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQV 221

: II : I : : I : : I I : : : : : I : I : : : I : I I I I

Db 3042 E--DSTPQNMYNDTINHAQSIIDQVANPTMSHD EIENAINNIKHAINALDGEHKLQQ 3096

Qy 222 LKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVTLKLLVIFENINDNFKWEENEP 281
I III : II :: | |: : :| : : | : :|| I I
Db 3097 AKENANLLIN S LN DLNAPQ RDAI N RLVNEAQT REKYAEQLQ S AQALN DAMKH LRN S - 3152



Qy 282 TQNQ 285 

I I I 

Db 3153 IQNQ 3156 



RESULT 14 
US-09-538-092-643

; Sequence 643, Application US/09538092
; Patent No. 6753314
; GENERAL INFORMATION:
; APPLICANT: Giot, Loic
; APPLICANT: Mansfield, Traci A.
; TITLE OF INVENTION: Protein-Protein Complexes and Method of Using Same
; FILE REFERENCE: 15966-542
; CURRENT APPLICATION NUMBER: US/09/538,092
; CURRENT FILING DATE: 2000-03-29
; PRIOR APPLICATION NUMBER: 60/127,352
; PRIOR FILING DATE: 1999-04-01
; PRIOR APPLICATION NUMBER: 60/178,965
; PRIOR FILING DATE: 2000-02-01
; NUMBER OF SEQ ID NOS: 1387
; SOFTWARE: CuraPatSeqFormatter Version 0.9
; SEQ ID NO 643
; LENGTH: 812
; TYPE: PRT
; ORGANISM: Saccharomyces cerevisiae
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (0)...(0)
; OTHER INFORMATION: Polypeptide Accession Number YMR309C
US-09-538-092-643

Query Match 5.7%; Score 97; DB 4; Length 812; 

Best Local Similarity 22.0%; Pred. No. 0.33; 

Matches 76; Conservative 54; Mismatches 130; Indels 86; Gaps 18; 

Qy 17 DWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARRAVQKRAS--PNSDDT 74

I I I :: I : I : I I : : : I : I I I I I

Db 97 DSSDEESDEEDGKKW KSAKEKLLDEMQDVYNKISQAENSDDW 139

Qy 75 VLSPQELQKV-LCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTR 133

: I : II : : : I : II II I I : I I

Db 140 LTI SNEFDLISRLLVRAQQQNWGTPNI FIKWAQVEDAVNNTQQADLKN-KAVARAYNTT 198

Qy 134 DPIVKEKALIVLNNLSVNAENQRRLKVYMN--QVCDDTITSRLNSSVQLAGLRLLTNMTV 191

II: | : M : : : I : I I : I : I I :

Db 199 KQRVKK VSRENEDSMAKFRNDPES FDKEPTADLDI SA NGFTI 240

Qy 192 TNEYQHMLANSISDFFR LFSAG NEETKLQVLKLLLNLAENPAMTRELLRAQ 242

: : : I III : I I I : : : : I : I I : I I I I : I
Db 241 SSSQGNDQAVQ-EDFFTRLQTIIDSRGKKTVNQQSLISTLEELLTVAEKP YEFIMAY 296

Qy 243 VPSSLGSLFNK KENKEVILKLLVIFENINDNFK WEENEPT 282

:|| : I I : Mill: I :: : hll

Db 297 LTLIPSRFDASANLSYQPIDQWKSSFNDISKLLSILDQTIDTYQVNEFADPIDFIEDEPK 356



Qy 283 QNQFGE GSLFFFLK EFQVCADKXLGI ESH-HDFLVKVK 319 

: : I I I : I I : : II I I : I I : I : : : : 

Db 357 EDSDGVKRILGSIFSFVERLDDEFM KSLLNIDPHSSDYLIRLR 399 



RESULT 15 
US-09-538-092-980 

; Sequence 980, Application US/09538092
; Patent No. 6753314
; GENERAL INFORMATION:
; APPLICANT: Giot, Loic
; APPLICANT: Mansfield, Traci A.
; TITLE OF INVENTION: Protein-Protein Complexes and Method of Using Same
; FILE REFERENCE: 15966-542
; CURRENT APPLICATION NUMBER: US/09/538,092
; CURRENT FILING DATE: 2000-03-29
; PRIOR APPLICATION NUMBER: 60/127,352
; PRIOR FILING DATE: 1999-04-01
; PRIOR APPLICATION NUMBER: 60/178,965
; PRIOR FILING DATE: 2000-02-01
; NUMBER OF SEQ ID NOS: 1387
; SOFTWARE: CuraPatSeqFormatter Version 0.9
; SEQ ID NO 980
; LENGTH: 937
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (0)...(0)
; OTHER INFORMATION: Polypeptide Accession Number P21851
US-09-538-092-980

Query Match 5.7%; Score 97; DB 4; Length 937; 

Best Local Similarity 19.8%; Pred. No. 0.41; 

Matches 44; Conservative 46; Mismatches 74; Indels 58; Gaps 10; 

Qy 86 CLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKALIVL 145

II : I II: : I : : : I :: I I I : :: :|:| I : I

Db 129 CLKD--EDPYVRKTAAVCVAK--LHDINAQMVEDQGFLDSLRDLIADSNPMWANAVAAL 184

Qy 146 NNLSVNAENQRRLKVYMNQ VCDDT 169

I I : I : I : I : I : : I : 

Db 185 SEISESHPNSNLLDLNPQNINKLLTALNECTEWGQIFILDCLSNYNPKDDREAQSIC-ER 243 

Qy 170 ITSRL NSSVQLAGLRLLTN MTVTNEYQHMLANS I SDFFRLFSAGNEETKLQVL 222 

: I I I I I : I I : : : : I : : : I : I I : : : I I : I 

Db 244 VTTPRLSHANSAVVLSAvTCvXMKFLELLPKDSDYYNM^ 303 

Qy 223 K-LLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVILKL 263 

: : I : : I I : I : :: : I II : : I I 

Db 304 RNINLIVQKRP EILKQEI KVFFVKYNDPIYVKL 336 



Search completed: January 7, 2005, 14:51:42 
Job time : 23.1064 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM protein - protein search, using sw model 

Run on:         January 7, 2005, 14:33:20 ; Search time 17.1939 Seconds
                                            (without alignments)
                                            1885.849 Million cell updates/sec

Title:          US-10-726-721A-9
Perfect score:  1715
Sequence:       1 RGDVDDAGDCSGARYNDWSD..........VKVGKFMAKLAEHMFPKSQE 337

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       283416 seqs, 96216763 residues

Total number of hits satisfying chosen parameters: 283416

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      PIR_79:*
                1: pir1:*
                2: pir2:*
                3: pir3:*
                4: pir4:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
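
The "Million cell updates/sec" figure in the run header can be cross-checked from the other header values. The sketch below is not GenCore code; it assumes the usual Smith-Waterman accounting of one cell update per (query residue, database residue) pair, and the function name is hypothetical. With the PIR_79 values (query length 337, 96216763 residues searched, 17.1939 seconds) it comes out within rounding of the reported 1885.849.

```python
def cell_updates_per_second(query_length, db_residues, search_seconds):
    """Return dynamic-programming cell updates per second.

    Assumption (not stated in the report): one cell update is counted for
    every pairing of a query residue with a database residue.
    """
    return query_length * db_residues / search_seconds


# Values copied from the PIR_79 run header of this report.
rate = cell_updates_per_second(337, 96_216_763, 17.1939)
print(f"{rate / 1e6:.3f} Million cell updates/sec")
```

The same arithmetic applied to the earlier A_Geneseq run (337 residues, 358729299 database residues, 64.8455 seconds) also lands close to its reported 1864.305, which supports the assumed definition.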

SUMMARIES 

                  %
Result          Query
   No.   Score  Match  Length  DB  ID       Description
------------------------------------------------------------------
     1   781.5   45.6     453   2  JC7582   armadillo (arm) rep
     2   683     39.8     632   2  T00084   hypothetical prote
     3   345     20.1    1395   2  T00068   hypothetical prote
     4   127.5    7.4     744   2  A32905   plakoglobin, desmo
     5   117.5    6.9     578   2  S50446   VAC8 protein - yea
     6   112      6.5     630   2  G87753   protein C43E11.8 [
     7   109      6.4     619   2  A36682   72K mitochondrial
     8   108.5    6.3     867   2  B96625   hypothetical prote
     9   107.5    6.3     876   2  T51951   gamma-adaptin 1 [i
    10   107      6.2     629   2  B64075   transcription init
    11   106.5    6.2     580   2  F84471   hypothetical prote
    12   106.5    6.2     729   2  A86416   probable arm repea
    13   105      6.1     428   2  T27763   hypothetical prote
    14   104      6.1     464   2  S50541   hypothetical prote
    15   104      6.1    1830   2  E82909   conserved hypothet
    16   103      6.0     449   2  T26571   hypothetical prote
    17   103      6.0     993   2  A96750   hypothetical prote
    18   103      6.0    1387   2  T10011   hypothetical prote
    19   101      5.9     711   2  F86373   protein T23E23.12
    20   101      5.9    1299   2  A86366   T26J12.6 protein -
    21   100.5    5.9     251   2  G75063   hypothetical prote
    22   100.5    5.9     476   2  T52157   hypothetical prote
    23   100.5    5.9    1979   2  C71622   hypothetical prote
    24   100      5.8     888   2  A38539   p101 protein precu
    25   100      5.8     895   2  T11979   Preprotein translo
    26   100      5.8     911   2  S28098   guanine-nucleotide
    27   100      5.8    1802   2  G71616   hypothetical prote
    28    99.5    5.8    1046   2  A86790   ATP-dependent dsDN
    29    99      5.8    2048   2  C84609   hypothetical prote
    30    99      5.8    3066   1  JQ1662   genome polyprotein
    31    97.5    5.7     459   2  T39473   probable geranylge
    32    97.5    5.7     949   2  D97781   hypothetical prote
    33    97.5    5.7    1164   2  T24806   hypothetical prote
    34    97      5.7     522   2  A57319   overgrown hematopo
    35    97      5.7     618   2  D86364   hypothetical prote
    36    97      5.7     924   2  T00518   hypothetical prote
    37    97      5.7     937   2  A35553   beta-adaptin - hum
    38    97      5.7     937   2  C35553   beta-adaptin - rat
    39    96.5    5.6     253   2  B71168   hypothetical prote
    40    96.5    5.6     738   2  S35093   plakoglobin - Afri
    41    96      5.6     966   2  D96662   hypothetical prote
    42    95.5    5.6     868   2  AE1953   hypothetical prote
    43    95      5.5     372   2  C83766   adenine glycosylas
    44    95      5.5     511   2  E90600   hypothetical prote
    45    95      5.5     865   2  T41685   probable gamma-ada



ALIGNMENTS 



RESULT 1 
JC7582 

armadillo (arm) repeat protein ALEX1 - human
C;Species: Homo sapiens (man)
C;Date: 30-Jun-2001 #sequence_revision 30-Jun-2001 #text_change 09-Jul-2004
C;Accession: JC7582
R;Kurochkin, I.V.; Yonemitsu, N.; Funahashi, S.; Nomura, H.
Biochem. Biophys. Res. Commun. 280, 340-347, 2001
A;Title: ALEX1, a novel human armadillo repeat protein that is expressed
differentially in normal tissues and carcinomas.
A;Reference number: JC7582; MUID:21092608; PMID:11162520
A;Accession: JC7582
A;Molecule type: mRNA
A;Residues: 1-453 <KUR>
A;Cross-references: UNIPROT:Q9P291; DDBJ:AB039670
C;Comment: This protein is involved in regulation of normal cell growth,
cell-to-cell signaling or in establishment of cell polarity, and such plays a
role in tumor suppression.
C;Genetics:
A;Gene: alex1
A;Map position: Xq21.33-q22.2
C;Keywords: tandem repeat; transmembrane protein



Query Match 45.6%; Score 781.5; DB 2; Length 453; 

Best Local Similarity 52.3%; Pred. No. 1.4e-49; 

Matches 157; Conservative 56; Mismatches 74; Indels 13; Gaps 3; 

Qy 39 RIGTEAGTRA RARARARATRA RRAVQKRAS PNSDDTVLS PQELQKVLCL 87 

Ihllll : :||:::||| || | | I :|| :||||| : 

Db 156 RSGSRAGGRASGKSKGKARSKSTRAPATTWPVRRG — KFNFPYKIDDILSAPDLQKVLNI 213 

Qy 88 VEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKALIVLNN 147

: I : I : I I I I : I I I I I I I : I I :: I I : I I I : I I : I I : : I : I I I :: I I III 
Db 214 LERTNDPFIQEVALWLGNNAAYSFNQNAIRELGGVPIIAKLIKTKDPIIREKTYNALNN 273 

Qy 148 LSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEYQHMLANSISDFF 207

I I I II I I I : : I I : : I I I I I I : I I : I : I I : I I I I I I I I I I I I I I I I : I : I III 
Db 274 LSVNAENQGKIKTYISQVCDDTMVCRLDSAVQMAGLRLLTNMTVTNHYQHLLSYSFPDFF 333 

Qy 208 RLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVTLKLLVTF 267

I II I I : I : : I I : : I I I I I I I I I I : : I I I I Mill: : : I : : I : I : I 
Db 334 ALLFLGNHFTKIQIMKLIINFTENPAMTRELVSCKVPSELISLFNKEWDREILLNILTLF 393 

Qy 268 ENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVCADKXLGIESHHDFLVKVKVGKFMAKL 327

I I I I I I I I : : : I I I I I II II I : : I : I : I I I I I I : I I 
Db 394 ENINDNIKNEGLASSRKEFSRSSLFFLFKESGVCVT(KIKAIjA^ 453 
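
The two percentages printed with each result can be reproduced from the other numbers in the report. The sketch below assumes, based only on the values shown here, that "Query Match" is the hit score divided by the perfect score (1715 for this query) and that "Best Local Similarity" is the number of matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). The function names are hypothetical; this is a consistency check, not GenCore's own code.

```python
def query_match_percent(score, perfect_score):
    """Hit score as a percentage of the query's self-alignment score."""
    return 100.0 * score / perfect_score


def best_local_similarity_percent(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all alignment columns."""
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


PERFECT_SCORE = 1715  # "Perfect score" from the run header

# RESULT 1 (JC7582): Score 781.5; Matches 157; Conservative 56;
# Mismatches 74; Indels 13.
print(round(query_match_percent(781.5, PERFECT_SCORE), 1))
print(round(best_local_similarity_percent(157, 56, 74, 13), 1))
```

Under these assumptions the values for RESULT 1 round to the reported 45.6% and 52.3%, and the same check holds for RESULT 2 (683; 140, 56, 91, 22) and RESULT 3 (345; 73, 49, 80, 0).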



RESULT 2 
T00084 

hypothetical protein KIAA0512 - human 
C; Species: Homo sapiens (man) 

C;Date: 22-Jan-1999 #sequence_revision 22-Jan-1999 #text_change 09-Jul-2004 
C; Accession: T00084 

R;Nagase, T.; Ishikawa, K.; Miyajima, N.; Tanaka, A.; Kotani, H.; Nomura, N.;
Ohara, O.

DNA Res. 5, 31-39, 1998 

A; Title: Prediction of the coding sequences of unidentified human genes. IX. The 
complete sequences of 100 new cDNA clones from brain which can code for large 
proteins in vitro. 

A; Reference number: Z14086; MUID: 98290545; PMID: 9628581 
A; Accession: T00084 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: mRNA 
A; Residues: 1-632 <NAG> 

A;Cross-references: UNIPROT:O60267; EMBL:AB011084; NID:g3043547;
PIDN:BAA25438.1; PID:g3043548

A; Experimental source: brain; clone HF0239 

C; Genetics : 

A; Note: KIAA0512 

Query Match 39.8%; Score 683; DB 2; Length 632; 

Best Local Similarity 45.3%; Pred. No. 3.4e-42; 

Matches 140; Conservative 56; Mismatches 91; Indels 22; Gaps 3; 



Qy 18 WSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARRAVQKRASPNSDDTVLS 77
I : I : I I : : I I : I I I hill I I : I
Db 345 WTDTESDSD S EPETQRRGRGRRPV AMQKRP FPYEIDEILG 384



Qy 78 PQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIV 137 

::|:IM |:: |: |:| : ||: I II! |: |:: II lllllhl ::| II : 
Db 385 VRDLRKVLALLQKSDDPFIQQV/UiLTLSNNANYSCNQETIRKLGGLPIIANMINKTDPHI 444 

Qy 138 KEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEYQH 197

Mill: : I I I I I III Ihlllhl II : I I I I : I I : I I : I I I I I : I I : I I I 
Db 445 KEKALMAMNNLSENYENQGRLQVYMNKVMDDIMASNLNSAVQWGLKF 504 

Qy 198 MLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENK 257

: I II I :: I I I I I I : I : : : I I : I I I I I I I : : I I I I I : I I I : I 
Db 505 LLVNS IAN FFRLLSQGGGKIKVEILKILSNFAENPDMLKKLLSTQVPASFSSLYNS YVES 564 

Qy 258 EVTLKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC7VDKXLGIESHHDFLVK 317 

I : : : I : I I I I I : I : I : I I I I : II I :: I I I I I I 

Db 565 EILINALTLFEIIYDNLRAEVF — NYREFNKGSLFYLCTTSGVCVKKIRALANHHDLLVK 622 

Qy 318 VKVGKFMAK 326 

I I I I : I 
Db 623 VKVTKLVNK 631 



RESULT 3 
T00068 

hypothetical protein KIAA0443 - human 
C; Species: Homo sapiens (man) 

C;Date: 22-Jan-1999 #sequence_revision 22-Jan-1999 #text_change 09-Jul-2004 
C;Accession: T00068 

R;Ishikawa, K.; Nagase, T.; Nakajima, D.; Seki, N.; Ohira, M.; Miyajima, N.;
Tanaka, A.; Kotani, H.; Nomura, N.; Ohara, O.
DNA Res. 4, 307-313, 1997 

A; Title: Prediction of the coding sequences of unidentified human genes. VIII. 
78 new cDNA clones from brain which code for large proteins in vitro. 
A; Reference number: Z14084; MUID: 98116655; PMID: 9455477 
A; Accession: T00068 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: mRNA 
A; Residues: 1-1395 <ISH> 

A;Cross-references: UNIPROT:O43168; EMBL:AB007903; NID:d1175359;
PIDN:BAA23715.1; PID:d1024620

A; Experimental source: brain; clone HJ0137 

C; Genetics : 

A; Note: KIAA0443 

Query Match 20.1%; Score 345; DB 2; Length 1395; 

Best Local Similarity 36.1%; Pred. No. 4.7e-17; 

Matches 73; Conservative 49; Mismatches 80; Indels 0; Gaps 0; 

Qy 76 LSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDP 135

: : I : : : I I : I I : I I : I : I I I I I I I I : : : : I I

Db 1151 IGSEEFEELLLLMEKIRDPFIHEISKIAMGMRSASQFTRDFIRDSGWSLIETLLNYPSS 1210

Qy 136 IVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEY 195

Db 1211 RVRTSFLENMIRMAPPYPNLNIIQTYICKVCEETLAYSVDSPEQLSGIRMIRHLTTTTDY 1270

Qy 196 QHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKE 255

: : I I : I I I : II :|: llhlllhll I I : I I I I : I lll::|
Db 1271 HTLVANYMSGFLSLLATGNAKTRFHVLKMLLNLSENLFMTKELLSAEAVSEFIGLFNREE 1330

Qy 256 NKEVTLKLLVIFENINDNFKWE 277

: I : I I I I I I : I I I
Db 1331 TNDNIQIVLAIFENIGNNIKKE 1352



RESULT 4 
A32905 

plakoglobin, desmosomal - human 
C; Species: Homo sapiens (man) 

C;Date: 22-Nov-1989 #sequence_revision 22-Nov-1989 #text_change 09-Jul-2004 
C;Accession: A32905 

R;Franke, W.W.; Goldschmidt, M.D.; Zimbelmann, R.; Mueller, H.M.; Schiller,
D.L.; Cowin, P.

Proc. Natl. Acad. Sci. U.S.A. 86, 4027-4031, 1989 

A; Title: Molecular cloning and amino acid sequence of human plakoglobin, the 
common junctional plaque protein. 

A;Reference number: A32905; MUID:89264555; PMID:2726765
A;Accession: A32905
A;Status: preliminary
A;Molecule type: mRNA
A;Residues: 1-744 <FRA>
A;Cross-references: UNIPROT:P14923; GB:M23410
C;Genetics:
A;Gene: GDB:JUP
A;Cross-references: GDB:126565; OMIM:173325
A; Map position: 7pter-7qter 
C; Keywords: cytoskeleton 

Query Match 7.4%; Score 127.5; DB 2; Length 744; 

Best Local Similarity 19.9%; Pred. No. 0.16; 

Matches 77; Conservative 49; Mismatches 122; Indels 139; Gaps 10; 

Qy 54 ARATRARRAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFN 113

Illlll: I :|: I : I ::| : I : : I : : | I | : 
Db 80 ARAKRWEAMCPGVSGEGQ1ALLATQVEGQATNLQRLAEPSQLLKSAIVHLIN YQDD 136 

Qy 114 RDIIRDLGGLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLK VYMNQ 164 

::: I I : I : I I I I : I I I : : : I I I ::| I I I 

Db 137 AELV — TRALPELTKLLNDEDPVVVTKAAMIWQLSKKEASRRALMGSPQLVAAVVRTMQ 194 

Qy 165 VCDDTITSRLNSSV 178 

| |:| :|: 

Db 195 NTSDLDTARCTTS I LHNLSHHREGLLAI FKSGGI PAL VRMLS S PVESVLFYAITTLHNLL 254 

Qy 179 QLAGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKL 219 

III : : I : : I : I : I I : I : I I 

Db 255 LYQEGAKMACAGRRAQKMVPLLNKNNPKFLAITTDCLQLLAYGNQESKLIILANGGPQAL 314 

Qy 220 QVLKLLLNLAEN-PAMTR ELLRAQVP 244 

: I I I : I III: : I : I 

Db 315 VQIMRNYSYEKLLWTTSRVXKVLSVCPSNKPAIVEAGGMQALGKHLTSNSPRLVQNCLWT 374 



Qy 



245 - S SLGSLFNKKENKEVI LKLLV 1 FEN I N DN FKWEEN E PTQNQ FG 287 



: I : I : I I : I I : I I I : I : III 

Db 375 LRNLSDVATKQEGLESVLKILWQLSVDDVimiTCATGTLSNLTC^SKNKTLVTQNSGV 434 

Qy 288 EGSLFFFLK EFQVCADKXL 306 

I : I : I I I I : I 

Db 435 EALIHAILRAGDKDDITEPAVCALRHL 461 



RESULT 5 
S50446 

VAC8 protein - yeast (Saccharomyces cerevisiae)
N;Alternate names: protein YEL013w
C;Species: Saccharomyces cerevisiae

C;Date: 28-May-1993 #sequence_revision 24-Feb-1995 #text_change 09-Jul-2004 
C;Accession: S50446 
R; Dietrich, F.S. 

submitted to the EMBL Data Library, December 1994 

A; Description: Saccharomyces cerevisiae chromosome V cosmids 9871, 8199, 9867, 

9495 and lambda clones 6693 and 5898. 

A; Reference number: S50428 

A;Accession: S50446 

A; Molecule type: DNA 

A; Residues: 1-578 <DIE> 

A;Cross-references: UNIPROT:P39968; EMBL:U18530; NID:g602367; PID:g602380;
GSPDB:GN00005; MIPS:YEL013w
C;Genetics:
A;Gene: SGD:VAC8; MIPS:YEL013w
A;Cross-references: SGD:S0000739; MIPS:YEL013w
A; Map position: 5L 
C; Function: 

A; Description: required for vacuole inheritance and protein targeting from the 

cytoplasm to vacuole 

C; Keywords: yeast vacuole 

Query Match 6.9%; Score 117.5; DB 2; Length 578; 

Best Local Similarity 24.2%; Pred. No. 0.63; 

Matches 59; Conservative 44; Mismatches 88; Indels 53; Gaps 9; 

Qy 76 LS PQELQKVLCLVEMS EKP YI LEAALI ALGNNAAYAFNRDI I RDLGGL- PI VAKI 129 

: I : I : : I I :': I : I I II I I I I I I : : I : : I I I I : : : : 

Db 82 VSREVXEPILILLQ-SQDPQIQVAACAALGNLAWNENKLLIV^IMGGLEPLINQMMGDW 140 

Qy 130 LNTRD PI VKEKALIVLNNLSVNAEN 154 

I I I I I : I : I I I : : : I I 

Db 141 EVQCNAVGC I TNLAT RDDN KH KI AT S GAL I P LT KLAKS KH I RVQRN AT GAL LNMT H S EEN 200 

Qy 155 QRRLKVYMNQVCDDTITSRLNSS VQIAGLRLLTNMTVTNEYQHMLANS ISDFF 207 

: : I : I : I I : I : II I : I : I : I I : : I 

Db 201 RKEL WAGAVPVXVS LLS ST DP DVQY YCTTAL SN I AVDEANRKKLAQTE P RLVS KLV 257 

Qy 208 RLFSAGNEETKLQVXKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVTLKLLVTF 267 

I : : II INI:: I : : I I I I : : : : : I : 

Db 258 SLMDSPSSRVKCQATLALRNLASDTSYQLEIVTWSGLPHLv^ 316 



Qy 268 ENIN 271
I I :
Db 317 RNIS 320



RESULT 6 
G87753 

protein C43E11.8 [imported] - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 10-May-2001 #sequence_revision 10-May-2001 #text_change 09-Jul-2004 
C;Accession: G87753

R; anonymous, The C. elegans Sequencing Consortium. 
Science 282, 2012-2018, 1998 

A; Title: Genome sequence of the nematode C. elegans: a platform for 
investigating biology. 

A; Reference number: A75000; MUID: 99069613; PMID: 9851916 
A;Note: see websites genome.wustl.edu/gsc/C_elegans/ and
www.sanger.ac.uk/Projects/C_elegans/ for a list of authors

A;Note: published errata appeared in Science 283, 35, 1999; Science 283, 2103, 

1999; and Science 285, 1493, 1999 

A;Accession: G87753
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-630 <STO>
A;Cross-references: UNIPROT:P91149; GB:chr_I; PIDN:AAB37623.1; PID:g1703569;
GSPDB:GN00019; CESP:C43E11.8

C; Genetics: 

A; Gene: C43E11.8 

A; Map position: 1 

Query Match 6.5%; Score 112; DB 2; Length 630;

Best Local Similarity 22.6%; Pred. No. 1.8; 

Matches 78; Conservative 54; Mismatches 109; Indels 104; Gaps 18; 

Qy 43 EAGTR/^RARARAR-ATRARRAVQKRASPNSDDTVLSPQELQKVLC LVEMSEKPYI 96 

II : : : I : : : I I : I I : I I II : : I I : I : I I : 

Db 248 EAQKSSQLASRSKLSTAVRKPVQR — SEKVD — VLIDLDACHAMCSALLSLLELEEK — L 301 

Qy 97 LEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKALIVL 145 

: I : I : I : I I : : : I : : I I I : : : 

Db 302 MVKAI PDTSKRA QVFRELVSRPLAYAWQTQ- KWNEKDI GI VPLLPLLHLLSQ 354 

Qy 146 NNLSVNA ENQRRLKV YMNQVCDD TITS 172 

: I I : I : I : I : I I : I : I : : II 

Db 355 N YARFHNLATN S I GDVQ FD S LMRQLQVKC S S YVNEVT ENLNEDTT KFVP PDGNVH PTTAS 414 

Qy 173 RLNSSVQLAGLRLLTNMTWNEYQHMLANSISDFFRLFSAGNEETKLQVXKLLLNLAENP 232 

II I I : I I I I I : I I I : I I : I I 
Db 415 TLNFLSSLTAHR VTVT QHVLA LTAPQGSNTNLLLPKLF 452 

Qy 233 AMTRELLRAQVPS SLGSLFNKKEN — KEVI LKLLVI FENINDNFKWEENE PTQNQ 285 

I : : I : I I I : I I I : I : : I I I : I I : 

Db 453 — ARILSALGSMLKKKANLYDDPTLATIFLLNNYNYIAKTLADEQDGLLPAITE 504 

Qy 286 FGEGSLFFFLKEFQVCADKXL— — GIESHHDFLVKVKVGKFMAK 326 

I I: :| I :: I III : :: I I I I 

Db 505 MNSNI LS FYHEEI ATCTNEYLKSWNGIASI LKSVDRI GEDKQMAK 549 



RESULT 7 



A36682 

72K mitochondrial outer membrane protein - Neurospora crassa 
C; Species: Neurospora crassa 

C;Date: 12-Apr-1991 #sequence_revision 12-Apr-1991 #text_change 09-Jul-2004 
C; Accession: A36682 

R;Steger, H.F.; Soellner, T. ; Kiebler, M. ; Dietmeier, K.A. ; Pfaller, R. ; 
Truelzsch, K.S.; Tropschug, M. ; Neupert, W. ; Pfanner, N. 
J. Cell Biol. 111, 2353-2363, 1990

A;Title: Import of ADP/ATP carrier into mitochondria: two receptors act in 
parallel. 

A;Reference number: A36682; MUID: 91115930; PMID:2177474 
A;Accession: A36682 
A; Status: preliminary 
A;Molecule type: mRNA 
A; Residues: 1-619 <STE> 

A;Cross-references: UNIPROT:P23231; GB:X53735; NID:g3027; PIDN:CAA37767.1;
PID:g3028
C;Superfamily: mitochondrial outer membrane protein, 70K; tetratricopeptide
repeat homology

C; Keywords: membrane protein; mitochondrion 

F;131-163/Domain: tetratricopeptide repeat homology #status atypical <TT1>
F;164-197/Domain: tetratricopeptide repeat homology <TT2>
F;198-231/Domain: tetratricopeptide repeat homology <TT3>
F;335-368/Domain: tetratricopeptide repeat homology <TT4>
F;369-402/Domain: tetratricopeptide repeat homology <TT5>
F;403-436/Domain: tetratricopeptide repeat homology <TT6>
F;512-545/Domain: tetratricopeptide repeat homology <TT7>
F;546-579/Domain: tetratricopeptide repeat homology <TT8>

Query Match 6.4%; Score 109; DB 2; Length 619; 

Best Local Similarity 25.2%; Pred. No. 2.9; 

Matches 86; Conservative 40; Mismatches 129; Indels 86; Gaps 18; 

Qy 31 IVWYPPWARIGTEAG T RARARARARAT RARRAVQ KRAS PN S D D — TVLSPQEL 81 

:|:| 1:1 :: I I :| :| : : |:| I I II 

Db 56 VVYYLRKGSEQKESGPKLSKKERRKRKQAEKASTSKTEEAAPTQPKAAAVESADELPEID 115 

Qy 82 -QKVXCLVEMSEKPYILEAALIALGNNA — AYAFNRDIIRDLGGLPIVAKILNTRDPIVK 138 

: I : I I II II I II I : I I : I II I : I I I : 

Db 116 EESWRLSEDERKAY — AAKLKELGNKAYGSKDFNKAI — DLYSKAIICK PDPVYY 167 

Qy 139 EKALIVLNNLS VNAENQRRLKVYMNQVCDDTIT S RLNS SVQLAGLRLLTNMTVTNE 194 

II: II: I I : : I : II: II : 
Db 168 SNRAACHNALAQWEQWADTTAALKLDPHYV — KALNRRANAYDQL — SR 213 

Qy 195 YQHML ANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSL 250 

I : I I hllll II:: I : I I I I I I : I : I : I I 

Db 214 YRHALLDFTASCIIDGFR NEQSAQAVERLLKKFAENKA — KEILETKPPKLPSST 266 

Qy 251 FNKKENKEVILKLLVIF ENINDNFKWEENEPTQNQFGEGSLFFFLKE 297 

I : I I | : | : : | : I I I I I 

Db 267 F VGNYLQSFRSKPRPEGLEDSVELSE ETGLGQLQLGLKHLESKTGT 312 

Qy 298 FQVCADKXLGI ESHHDFLVKVKVGKFMAKLAEH 330 

I : III I : I II : : I 

Db 313 GYEEGSAAFKKALD— LGELGPHEALAYNLRGTFHCLMGKH 351 



RESULT 8 
B96625 

hypothetical protein T2K10.12 [imported] - Arabidopsis thaliana 
C; Species: Arabidopsis thaliana (mouse-ear cress) 

C;Date: 02-Mar-2001 #sequence_revision 02-Mar-2001 #text_change 09-Jul-2004 
C; Accession: B96625 

R;Theologis, A.; Ecker, J.R.; Palm, C.J.; Federspiel, N.A.; Kaul, S.; White, O.;
Alonso, J.; Altaf, H.; Araujo, R.; Bowman, C.L.; Brooks, S.Y.; Buehler, E.;
Chan, A.; Chao, Q.; Chen, H.; Cheuk, R.F.; Chin, C.W.; Chung, M.K.; Conn, L.;
Conway, A.B.; Conway, A.R.; Creasy, T.H.; Dewar, K.; Dunn, P.; Etgu, P.;
Feldblyum, T.V.; Feng, J.; Fong, B.; Fujii, C.Y.; Gill, J.E.; Goldsmith, A.D.;
Haas, B.; Hansen, N.F.; Hughes, B.; Huizar, L.
Nature 408, 816-820, 2000
A;Authors: Hunter, J.L.; Jenkins, J.; Johnson-Hopson, C.; Khan, S.; Khaykin, E.;
Kim, C.J.; Koo, H.L.; Kremenetskaia, I.; Kurtz, D.B.; Kwan, A.; Lam, B.;
Langin-Hooper, S.; Lee, A.; Lee, J.M.; Lenz, C.A.; Li, J.H.; Li, Y.; Lin, X.; Liu,
S.X.; Liu, Z.A.; Luros, J.S.; Maiti, R.; Marziali, A.; Militscher, J.; Miranda,
M.; Nguyen, M.; Nierman, W.C.; Osborne, B.I.; Pai, G.; Peterson, J.; Pham, P.K.;
Rizzo, M.; Rooney, T.; Rowley, D.; Sakano, H.
A;Authors: Salzberg, S.L.; Schwartz, J.R.; Shinn, P.; Southwick, A.M.; Sun, H.;
Tallon, L.J.; Tambunga, G.; Toriumi, M.J.; Town, C.D.; Utterback, T.; van Aken,
S.; Vaysberg, M.; Vysotskaia, V.S.; Walker, M.; Wu, D.; Yu, G.; Fraser, C.M.;
Venter, J.C.; Davis, R.W.

A; Title: Sequence and analysis of chromosome 1 of the plant Arabidopsis. 

A; Reference number: A86141; MUID: 21016719; PMID: 11130712 

A; Accession: B96625 

A; Status: preliminary 

A; Molecule type: DNA 

A; Residues: 1-867 <STO> 

A;Cross-references: UNIPROT:Q9ZUI6; GB:AE005173; NID:g4249386; PIDN:AAD14483.1;
GSPDB:GN00141
C;Genetics:

A; Gene: T2K10.12 

A;Map position: 1 

Query Match 6.3%; Score 108.5; DB 2; Length 867; 

Best Local Similarity 22.5%; Pred. No. 4.8; 

Matches 73; Conservative 39; Mismatches 96; Indels 117; Gaps 15; 

Qy 41 GT EAGT RARA- RARARAT RARRAVQ K RASPNSDD 73 

II I I I I I I I : I I I I I : I 

Db 7 GTRLSDMIRAIRASKT7\AEERAVVRKECAAIRASINENDQDYRHRDLAKLMFIHMLGYPT 66 

Qy 74 TVLSP QELQKVLCLVEMSEK PYILEAAL 101 

: I I Ihllll II II: II 

Db 67 HFGQMECLKLIAS PGFPEKRI GYLGLMLLLDERQEVLMLVTNSLKQDLNHTNQYI VGLAL 126 

Qy 102 IALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKALI VLNN L S VNAENQ RR 157 

I I I I : II: I I : : I I I I : : : I I : : : : : I I 
Db 127 CALGN I C S AEMARDL APEVERLLQFRDPNIRKKAALCAIRIIRKVPDLSEN 177 

Qy 158 LKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTN EYQHM LAN 201 

: : I : : | : | : | | : : | | : | | 

Db 178 FINP — GAALLKEKHHGVLITGVHLCTEICKVSSEALEYFRKKCTEGLVKTLRDIAN 232 



Qy 202 S ISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLR AQVPSSLG 248



I I : I I :::IMI I : I : :. Ill I 

Db 233 SPYSPEYDVAGITDPF LH I RLLKLLRVLGQGDADAS DCMN DI LAQVAS KT E 283 

Qy 249 SLFNKKENKEVTLKLLVT FENINDN 273 

I II :: : : : I : I 

Db 284 S — NKNAGNAI LYECVQT IMS I EEN 306 



RESULT 9 
T51951 

gamma-adaptin 1 [imported] - Arabidopsis thaliana
C; Species: Arabidopsis thaliana (mouse-ear cress) 

C;Date: 20-Oct-2000 #sequence_revision 20-Oct-2000 #text_change 09-Jul-2004 
C;Accession: T51951 

R;Schledzewski, K.; LaBrie, S.T.; Crawford, N.M.; Brinkmann, H.; Medel, R.R.
submitted to the EMBL Data Library, April 1998 

A;Description: Sequencing of the Arabidopsis thaliana EST clone 203C19T7
(Accession H77083) that is homologous to gamma-adaptin from mouse and Ustilago
maydis.

A; Reference number: Z25886 
A; Accession: T51951 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: mRNA 
A; Residues: 1-876 <SCH> 

A;Cross-references: UNIPROT:O81227; EMBL:AF061286; PIDN:AAC28338.1
A; Experimental source: cultivar Columbia 

Query Match 6.3%; Score 107.5; DB 2; Length 876; 

Best Local Similarity 22.7%; Pred. No. 5.8; 

Matches 73; Conservative 35; Mismatches 99; Indels 115; Gaps 14; 

Qy 44 AGTRARARARA RAT RARRAVQ K RASPNSDD 73 

: I I I I II I I I : I I I I I I 

Db 6 SGTRLRDMI RAI RACKTAAEERAVVRKECADI RALINEDDPHDRHRNLAKLMFIHMLGYP 65 

Qy 74 TVLSP QELQKVLCLVEMSEK PYILEAA 100 

: I I Ihllllll I :: I 

Db 66 THFGQMECLKLIAiSPGFPEKRIGYLGl^LLLDERQEVXMLW 125 

Qy 101 LIALGNNAAYAFNRDI I RDLGGLPIVAKILNTRDPI VKEKALI VLNNL SV 150 

Mill: II: | | : : : | | | : : : | | : : I 

Db 126 LCALGNICSAEMARDL APEVERLIQFRDPNIRKKAALCSTRIIRKVPDLAENFV 179 

Qy 151 NA ENQRRLKVYMNQVCDDTITSRLNSSV QLAGLRLLTNMTVTN 193 

II I : : I : I : | : | : I I : I I 

Db 180 NAAASLLKEKHHGVLITGVQLCYELCT — INDEALEYFRTKCTEGLIKTLRDITNSAYQP 237 

Qy 194 EYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLR AQVPSSLGSL 250 

II I : I I : : : | : | | | : | : | : M I : | 

Db 238 EYD VAGITDPF LH I RL L RLL RVLGQGDADAS DLMT D I LAQVAT KT E S - 284 

Qy 251 FNKKENKEVTLKLLVI FENIND 272 

II I : : : II 

Db 285 -NKNAGNAVLYECVETIMAI ED 305 



RESULT 10 



B64075 

transcription initiation factor sigma 70 - Haemophilus influenzae (strain Rd 
KW20) 

C; Species: Haemophilus influenzae 

C;Date: 18-Aug-1995 #sequence_revision 18-Aug-1995 #text_change 09-Jul-2004 
C;Accession: B64075 

R;Fleischmann, R.D.; Adams, M.D.; White, O.; Clayton, R.A.; Kirkness, E.F.;
Kerlavage, A.R.; Bult, C.J.; Tomb, J.F.; Dougherty, B.A.; Merrick, J.M.;
McKenney, K.; Sutton, G.; FitzHugh, W.; Fields, C.; Gocayne, J.D.; Scott, J.;
Shirley, R.; Liu, L.I.; Glodek, A.; Kelley, J.M.; Weidman, J.F.; Phillips, C.A.;
Spriggs, T.; Hedblom, E.; Cotton, M.D.; Utterback, T.R.; Hanna, M.C.; Nguyen,
D.T.; Saudek, D.M.; Brandon, R.C.; Fine, L.D.; Fritchman, J.L.; Fuhrmann, J.L.;
Geoghagen, N.S.M.
Science 269, 496-512, 1995
A;Authors: Gnehm, C.L.; McDonald, L.A.; Small, K.V.; Fraser, C.M.; Smith, H.O.;
Venter, J.C.

A; Title: Whole-genome random sequencing and assembly of Haemophilus influenzae 
Rd. 

A;Reference number: A64000; MUID: 95350630 ; PMID:7542800 
A; Accession: B64075 

A; Status: nucleic acid sequence not shown; translation not shown 

A; Molecule type: DNA 

A; Residues: 1-629 <TIGR> 

A;Cross-references: UNIPROT:P43766; GB:U32735; GB:L42023; NID:g1573509;
PIDN:AAC22190.1; PID:g1573517; TIGR:HI0533
C;Superfamily: transcription initiation factor sigma 70; transcription
initiation factor sigma katF homology; transcription initiation factor sigma
region 1 homology

C; Keywords: DNA binding; sigma factor; transcription initiation 
F;1-143/Domain: transcription initiation factor sigma region 1 homology <SR1>
F;397-623/Domain: transcription initiation factor sigma katF homology <KTF> 

Query Match 6.2%; Score 107; DB 2; Length 629; 

Best Local Similarity 21.2%; Pred. No. 4.1; 

Matches 55; Conservative 37; Mismatches 94; Indels 74; Gaps 10; 

Qy 3 DVDDAGDCSGARYNDWSDDDD DSNESKSIVWYPPWARIGTEAGTRARARARAR 55 

I I I : I I I I : : I I I :: I : II : I : 

Db 192 DEDDEEESSNADVEDNEDEEDNESESTSDSSDSDN SIDPEVAREKFQQLREQ 243 

Qy 56 ATRARRAVQK — RAS PNS DDTVLS PQELQK VLCLVEMSEKPYILEAAL 101 

:: ::| |: : I : |: | || : || :: | | 

Db 244 HSKTLAVI EKHGRSGKRAQDQIALLGEI FKQFRLWKQFDLLVLSMKEIMMKRVRYQERQL 303 

Qy 102 IALGNNAAYAFNRDI IRDLGGLP IVAKILNTRDPIVK E 139 

: I : I : I : I I I I I : I I : 

Db 304 QKILVDIAGMPKDDFEKIITTNGSNSEWVAKALKSSKPWAKRLIKYED 351 

Qy 140 KALIVLNNLSVNAENQRRLKVYMNQVCD DTI T S RLNS S VQLAGLRLLTNMTVTNE 194 

: I I I I : : I I : I : I I : I : I I I I : : : : : 

Db 352 RIYEALNNLAITEENTKLTITQMRDICDAVARGEQKARRAKKEMVEANLRLV--ISIAKK 409 

Qy 195 YQHMLAN SIS D FFRLFS AGN 214 

I I till 

Db 410 Y TNRGLQFLDLIQEGN 425 



RESULT 11 
F84471 

hypothetical protein At2g05810 [imported] - Arabidopsis thaliana
C;Species: Arabidopsis thaliana (mouse-ear cress)
C;Date: 02-Feb-2001 #sequence_revision 02-Feb-2001 #text_change 09-Jul-2004
C;Accession: F84471
R;Lin, X.; Kaul, S.; Rounsley, S.D.; Shea, T.P.; Benito, M.I.; Town, C.D.;
Fujii, C.Y.; Mason, T.M.; Bowman, C.L.; Barnstead, M.E.; Feldblyum, T.V.; Buell,
C.R.; Ketchum, K.A.; Lee, J.J.; Ronning, C.M.; Koo, H.; Moffat, K.S.; Cronin,
L.A.; Shen, M.; VanAken, S.E.; Umayam, L.; Tallon, L.J.; Gill, J.E.; Adams,
M.D.; Carrera, A.J.; Creasy, T.H.; Goodman, H.M.; Somerville, C.R.; Copenhaver,
G.P.; Preuss, D.; Nierman, W.C.; White, O.; Eisen, J.A.; Salzberg, S.L.; Fraser,
C.M.; Venter, J.C.
Nature 402, 761-768, 1999
A;Title: Sequence and analysis of chromosome 2 of the plant Arabidopsis
thaliana.
A;Reference number: A84420; MUID:20083487; PMID:10617197
A;Accession: F84471
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-580 <STO>
A;Cross-references: UNIPROT:Q8S8G1; GB:AE002093; NID:g6598505; PIDN:AAF18619.1;
GSPDB:GN00139
C;Genetics:
A;Gene: At2g05810
A;Map position: 2

Query Match 6.2%; Score 106.5; DB 2; Length 580; 

Best Local Similarity 18.6%; Pred. No. 4; 

Matches 56; Conservative 52; Mismatches 86; Indels 107; Gaps 8; 

Qy 67 ASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALI 102 

: I : I II I : I I : I I : I I : 

Db 232 SSADSRKTVFEQGGLGPLLRLLETGSSPFKTRAAIAIEAITADPATAWAI SAYGGVTVLI 291 

Qy 103 ALGNNAAYAFNRDI I RDLGGLP I VAKI LNT 132 

I : I I I I : : I : I : : :: | : 

Db 292 EACRSGSKQVQEHIAGAISNIAAVEEIRTTLAEEGAIPVXIQLLISGSSSVQEKTANFIS 351 

Qy 133 RDPIVKEKA LIVLNNLSVNAENQRRLKVYMNQV-CDDTITSRLNSS- 177 

I I I I : I : Ml II: : : : I : : I : : I : I I 

Db 352 LISSSGEYYRDLIVRERGGLQILIHLVQESSNPDTIEHCLLALSQISAMETVSRVLSSST 411 

Qy 178 VQLAGLRLLTNMTVTNEYQHMLANSISDFFRLFS AGNEET 217 

: I | | : | : | : : : : : I : : I II I I : I 

Db 412 RFI I RLGELI KHGNVI LQQI ST SLLSNLT I SDGNKRAVADCLS S LI RLMES PKPAGLQEA 471 

Qy 218 KLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWE 277 

: I I I : I : I I : I :::::: | ::: | : : 

Db 472 ATEAAKSLLTVRSN RKELMR DEKSVT RLVQMLDPRNERMNNK 513 

Qy 278 E 278 

I 

Db 514 E 514 



RESULT 12 



A86416 

probable arm repeat-containing protein - Arabidopsis thaliana
C;Species: Arabidopsis thaliana (mouse-ear cress)
C;Date: 02-Mar-2001 #sequence_revision 02-Mar-2001 #text_change 09-Jul-2004
C;Accession: A86416
R;Theologis, A.; Ecker, J.R.; Palm, C.J.; Federspiel, N.A.; Kaul, S.; White, O.;
Alonso, J.; Altaf, H.; Araujo, R.; Bowman, C.L.; Brooks, S.Y.; Buehler, E.;
Chan, A.; Chao, Q.; Chen, H.; Cheuk, R.F.; Chin, C.W.; Chung, M.K.; Conn, L.;
Conway, A.B.; Conway, A.R.; Creasy, T.H.; Dewar, K.; Dunn, P.; Etgu, P.;
Feldblyum, T.V.; Feng, J.; Fong, B.; Fujii, C.Y.; Gill, J.E.; Goldsmith, A.D.;
Haas, B.; Hansen, N.F.; Hughes, B.; Huizar, L.
Nature 408, 816-820, 2000
A;Authors: Hunter, J.L.; Jenkins, J.; Johnson-Hopson, C.; Khan, S.; Khaykin, E.;
Kim, C.J.; Koo, H.L.; Kremenetskaia, I.; Kurtz, D.B.; Kwan, A.; Lam, B.; Langin
Hooper, S.; Lee, A.; Lee, J.M.; Lenz, C.A.; Li, J.H.; Li, Y.; Lin, X.; Liu,
S.X.; Liu, Z.A.; Luros, J.S.; Maiti, R.; Marziali, A.; Militscher, J.; Miranda,
M.; Nguyen, M.; Nierman, W.C.; Osborne, B.I.; Pai, G.; Peterson, J.; Pham, P.K.;
Rizzo, M.; Rooney, T.; Rowley, D.; Sakano, H.
A;Authors: Salzberg, S.L.; Schwartz, J.R.; Shinn, P.; Southwick, A.M.; Sun, H.;
Tallon, L.J.; Tambunga, G.; Toriumi, M.J.; Town, C.D.; Utterback, T.; van Aken,
S.; Vaysberg, M.; Vysotskaia, V.S.; Walker, M.; Wu, D.; Yu, G.; Fraser, C.M.;
Venter, J.C.; Davis, R.W.
A;Title: Sequence and analysis of chromosome 1 of the plant Arabidopsis.
A;Reference number: A86141; MUID:21016719; PMID:11130712
A;Accession: A86416
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-729 <STO>
A;Cross-references: UNIPROT:Q9C7R6; GB:AE005172; NID:g10092208; PIDN:AAG12624.1;
GSPDB:GN00141
C;Genetics:
A;Map position: 1

Query Match 6.2%; Score 106.5; DB 2; Length 729; 

Best Local Similarity 23.0%; Pred. No. 5.4; 

Matches 51; Conservative 39; Mismatches 101; Indels 31; Gaps 7; 

Qy 68 SPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAF N 113 

fill I : II:: : : I I : I I I 

Db 386 SPNESFASALPTK AAVEAN KAT VS I L I K YLADG S QAAQT VAARE I RLLAKT GKEN 440 

Qy 114 RDIIRDLGGLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSR 173 

I I : I : I : : : I : : I : I : : : I I I : : I : I : I : : I I 

Db 441 RAYI AEAGAI PHLCRLLTS ENA1 AQENSVTAMLNLS I YEKNKS R — IMEEGDCLES I VSV 498 

Qy 174 LNSSV QLAGLRLLTNMTVTNEYQHMLA NSISDFFRLFSAGNEETKLQVLKLLL 226 

II: I | : : : : | | : : | : | | | : | 

Db 499 LVSGLTVT^QENAAATLFSLSAVTIEYKKRIAIVTDQCvT^ 558 

Qy 227 NLAENPAMTRELLRAQVPSSL-GSLFNKKENKEV--ILKLLV 265 

I I : : I : : Nihil: : | I I I I 

Db 559 NLSTHPDNCSRMIEGGGVSSLVGALKNEGVAEEAAGALALLV 600 



RESULT 13 
T27763 

hypothetical protein ZK177.5 - Caenorhabditis elegans
C;Species: Caenorhabditis elegans
C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 15-Oct-1999
C;Accession: T27763
R;Anderson, K.
submitted to the EMBL Data Library, July 1995
A;Description: The sequence of C. elegans cosmid ZK177.
A;Reference number: Z20416
A;Accession: T27763
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-428 <AND>
A;Cross-references: EMBL:U21321; PIDN:AAB36969.1; GSPDB:GN00020; CESP:ZK177.5
A;Experimental source: strain Bristol N2; clone ZK177
C;Genetics:
A;Gene: CESP:ZK177.5
A;Map position: 2
A;Introns: 55/1; 95/1; 133/2; 157/2; 232/2; 253/3; 338/2

Query Match 6.1%; Score 105; DB 2; Length 428; 

Best Local Similarity 23.5%; Pred. No. 3.5; 

Matches 57; Conservative 35; Mismatches 81; Indels 70; Gaps 10; 

Qy 71 S DDTVLS PQELQKVLCLVEM SEKPYILEAALIALGNNAAYA FNRD 115 

I : I I : I I I I : I I : I : I : I I I : : I I : 

Db 166 SNDLVCHVADQQKRFGLVDMQKVAGRWSLESAGQILFEKSLGSLGNRSEWADGLIELNKK 225 

Qy 116 II RDL — GGLPIVAKILNTRDPIVKEKALIVLN — NLSVNAENQRRLKVYMNQVC 166 

I : I : I I I I I I I I : : : | | | : : | | | : : 

Db 226 I FQLSANKDMRFAS YLINRKELNRRD — VKTAPMLI YNLYNLATHPE ALKEIQKEIK 280 

Qy 167 DDTITSRLNSSVQLAGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLL 226 

: I : I : I I I : I I : |. I I : : : I : I 

Db 281 EDPASSKLT FLRACIKETFRMFPIGTEVSRVTQKNLIL 318 

Qy 227 NLAENPAMTRELLRAQVPSSLGSLFNKKENKEVILKLLVI FENINDNFK WEENEPTQ 283 

: I I I I : | | : : : | : | : | | I I 

Db 319 SGYEVPAGTAVDI NTNVLMRHEVLFSDSPREFKPQRWLEKSKEV 362 

Qy 284 NQF 286 

: I 

Db 363 HPF 365 



RESULT 14 
S50541 

hypothetical protein YER038c - yeast (Saccharomyces cerevisiae)
C;Species: Saccharomyces cerevisiae
C;Date: 28-May-1993 #sequence_revision 24-Feb-1995 #text_change 09-Jul-2004
C;Accession: S50541
R;Dietrich, F.S.
submitted to the EMBL Data Library, December 1994
A;Description: The sequence of S. cerevisiae cosmids 9379, 9581, and lambda
clone 4678.
A;Reference number: S50432
A;Accession: S50541
A;Molecule type: DNA
A;Residues: 1-464 <DIE>
A;Cross-references: UNIPROT:P40026; EMBL:U18796; NID:g603265; PID:g603271;
GSPDB:GN00005; MIPS:YER038c
C;Genetics:
A;Gene: SGD:KRE29; MIPS:YER038c
A;Cross-references: SGD:S0000840
A;Map position: 5R
C;Superfamily: Saccharomyces cerevisiae hypothetical protein YER038c

Query Match 6.1%; Score 104; DB 2; Length 464; 

Best Local Similarity 18.8%; Pred. No. 4.6; 

Matches 68; Conservative 57; Mismatches 110; Indels 126; Gaps 15; 

Qy 20 DDDDDSNESKSIVWYPPWARIGTE AGT RARARARARAT RARRAVQKR 66 

I I I I I : : I : : : : I : I : I : I I 

Db 38 DDDDDEKVHPNFISDPENDSLNSDEEFSSLENSDLNLSGAKAESGDDFDPILKRTIISKR 97 

Qy 67 ASPNS DDTVLS PQELQKVLCLVEMS EKP YI LEAALI ALGNNAAYAFNRDI I RDLGGL 123 

: I : : : : I : I : : I : I : I I : : I : I 
Db 98 KAPSNNEDEEIVKTPRKLVNYVPL KIFNLGD SFDDTI T 135 

Qy 124 PIVAKILNTRDPIV KEKALIVLNNLSVNAENQRRLK VYMNQVCDDTITS 172 

III: : : I: I:::: :| :| h :| :|:: I :||: 

Db 136 TTVAKLQDLKKEILDSPRSNKSIVITSNTVAKSELQKSIKFSGSIPEIYLDVVTKETISD 195 

Qy 173 RLNS SVQLAGLRL LTNMTVTNEYQHMLAN SIS DFFRL FS A 212 

: III: : I : I : I I : : I I 

Db 196 KYKDWHFISKNCHYEQLMDLEMKDTAYSFLFGSSRSQGKVPEFVHLKCPSITNLLVLFGV 255 

Qy 213 GNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEV IL 261 

I : : I I : I I I I : I 

Db 256 NQEKC NSLKINYEKKENSRYDNLCTIFPVNKML 288 

Qy 262 KLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC--ADKXL--GIESHHDFLVK 317 

I I : I : : I I | I I I I I : I I : : : I I I I 

Db 289 KFLMYFYS DDDNDDVRE FFLKAF-ICLILDRKVFNAMESDHRLCFK 333 

Qy 318 V 318 

I 

Db 334 V 334 



RESULT 15 
E82909 

conserved hypothetical UU292 [imported] - Ureaplasma urealyticum
C;Species: Ureaplasma urealyticum
C;Date: 18-Aug-2000 #sequence_revision 20-Aug-2000 #text_change 20-Aug-2000
C;Accession: E82909
R;Glass, J.I.; Lefkowitz, E.J.; Glass, J.S.; Heiner, C.R.; Chen, E.Y.; Cassell,
G.H.
submitted to GenBank, February 2000
A;Description: The complete sequence of Ureaplasma urealyticum: Alternate views
of a minimal genome and sexually transmitted pathogen.
A;Reference number: A82870
A;Accession: E82909
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-1830 <GLA>
A;Cross-references: GB:AE002127; GB:AF222894; NID:g6899268; PIDN:AAF30701.1;
GSPDB:GN00123; UUSP:UU292
A;Experimental source: serovar 3; biovar 1
C;Genetics:
A;Gene: UU292
A;Genetic code: SGC3

Query Match 6.1%; Score 104; DB 2; Length 1830; 

Best Local Similarity 19.5%; Pred. No. 27; 

Matches 65; Conservative 52; Mismatches 96; Indels 120; Gaps 14; 

Qy 77 SPQELQKVLCLVEMSE KP Y — I LEAALI ALGNNAAYAFNRDI I RDLGGLP I VAKI LN 131 

:|::: I: |::::| : I I : I I : : I I I h : I 
Db 1400 TPEKVljKISSLLDINEIDARDYADIIEIILV^ IGF^^\/TSIQNNDWNN LK 1449 

Qy 132 TRDPIVKEKALIVLN N L S VN AENQ RRL KVYMNQVC D DT I T S RLN S S VQLAGLR- 184 

II : I : I I I : : : : : I : I : I I : : I I : I I I 

Db 1450 TIDKKTNDFATQILKKALNNGVSDQQLQKIKNVWYLLDDIVKIHDSSSFLRIQLEQLSH 1509 

Qy 185 LLTNMTVTNEYQHMLA NS 202 

I : : I : : : I : : II 
Db 1510 VLVDKITAVXPKSLTIKHKYTKIFSSILNNQDFLQKAKTLLSTILNELIDHKDKYKDINS 1569 

Qy 203 ISDFFRLFSAGN-EETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVIL 261 

I : : I : | | : | | : : I : : : : I I I I I II 

Db 1570 FSELISVFFKNKASDLKTQLKDLLNTILKNQTLITNIGQVIIESF KLENKISIL 1623 

Qy 262 KL LVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVCA 302 

I : I I : I : I I I III: : 

Db 1624 DSDLEQNTI S FINKI FAHITELPI YTNLVDNF FNFFSEYTKSS 1666 

Qy 303 D-KXLGIESHHDFLVKVKVGKFMAKLAEHMFPK 334 

III I I : i : : I I 

Db 1667 DTKTLNF NKFKSSLFQAIIPK 1687 



Search completed: January 7, 2005, 14:52:28 
Job time : 21.1939 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: January 7, 2005, 14:51:07 ; Search time 63.3717 Seconds 

(without alignments) 
1917.457 Million cell updates/sec 

Title: US-10-726-721A-9 
Perfect score: 1715 

Sequence: 1 RGDVDDAGDCSGARYNDWSD VKVGKFMAKLAEHMFPKSQE 337 



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 



Searched: 1603904 seqs, 360571292 residues 



Total number of hits satisfying chosen parameters: 1603904 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database : Published_Applications_AA:*
1: /cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep:*
2: /cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep:*
3: /cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep:*
4: /cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep:*
5: /cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep:*
6: /cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep:*
7: /cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep:*
8: /cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep:*
9: /cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep:*
10: /cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep:*
11: /cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep:*
12: /cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep:*
13: /cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep:*
14: /cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep:*
15: /cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep:*
16: /cgn2_6/ptodata/2/pubpaa/US10D_PUBCOMB.pep:*
17: /cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep:*
18: /cgn2_6/ptodata/2/pubpaa/US11_NEW_PUB.pep:*
19: /cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep:*
20: /cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
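
The "% Query Match" figures reported for each hit in this listing appear to be the hit's score expressed as a percentage of the perfect score given in the header (1715). This relationship is inferred from the numbers in this report, not taken from GenCore documentation, and the function name below is hypothetical. A minimal sketch under that assumption:

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect score, rounded to one decimal.

    Assumption: this mirrors how the report's "% Query Match" column is
    derived; it is inferred from the reported values only.
    """
    return round(100.0 * score / perfect_score, 1)


PERFECT_SCORE = 1715  # "Perfect score" value from the report header

# Scores taken from hits listed in this report.
for score in (1713, 1708, 107):
    print(score, query_match_percent(score, PERFECT_SCORE))
```

Under this assumption the scores 1713, 1708 and 107 reproduce the 99.9%, 99.6% and 6.2% values printed alongside them in the report.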

SUMMARIES 

                   %
Result           Query
   No.   Score   Match  Length  DB  ID                   Description

     1    1713    99.9     337   9  US-09-780-996-9      Sequence 9, Appli
     2    1713    99.9     337  16  US-10-726-721-9      Sequence 9, Appli
     3    1708    99.6     342  16  US-10-408-765A-834   Sequence 834, App
     4    1708    99.6     379  14  US-10-028-072-216    Sequence 216, App
     5    1708    99.6     379  14  US-10-140-808-216    Sequence 216, App
     6    1708    99.6     379  14  US-10-121-049-216    Sequence 216, App
     7    1708    99.6     379  14  US-10-123-904-216    Sequence 216, App
     8    1708    99.6     379  14  US-10-140-470-216    Sequence 216, App
     9    1708    99.6     379  14  US-10-175-746-216    Sequence 216, App
    10    1708    99.6     379  14  US-10-176-918-216    Sequence 216, App
    11    1708    99.6     379  14  US-10-176-921-216    Sequence 216, App
    12    1708    99.6     379  14  US-10-137-865-216    Sequence 216, App
    13    1708    99.6     379  14  US-10-140-474-216    Sequence 216, App
    14    1708    99.6     379  14  US-10-142-431-216    Sequence 216, App
    15    1708    99.6     379  14  US-10-143-114-216    Sequence 216, App
    16    1708    99.6     379  14  US-10-140-002-216    Sequence 216, App
    17    1708    99.6     379  14  US-10-142-419-216    Sequence 216, App
    18    1708    99.6     379  14  US-10-123-262-216    Sequence 216, App
    19    1708    99.6     379  14  US-10-142-423-216    Sequence 216, App
    20    1708    99.6     379  14  US-10-121-050-216    Sequence 216, App
    21    1708    99.6     379  14  US-10-141-755-216    Sequence 216, App
    22    1708    99.6     379  14  US-10-143-032-216    Sequence 216, App
    23    1708    99.6     379  14  US-10-123-108-216    Sequence 216, App
    24    1708    99.6     379  14  US-10-123-236-216    Sequence 216, App
    25    1708    99.6     379  14  US-10-123-261-216    Sequence 216, App
    26    1708    99.6     379  14  US-10-140-921-216    Sequence 216, App
    27    1708    99.6     379  14  US-10-140-928-216    Sequence 216, App
    28    1708    99.6     379  14  US-10-121-045-216    Sequence 216, App
    29    1708    99.6     379  14  US-10-123-292-216    Sequence 216, App
    30    1708    99.6     379  14  US-10-123-903-216    Sequence 216, App
    31    1708    99.6     379  14  US-10-124-819-216    Sequence 216, App
    32    1708    99.6     379  14  US-10-124-822-216    Sequence 216, App
    33    1708    99.6     379  14  US-10-140-925-216    Sequence 216, App
    34    1708    99.6     379  14  US-10-160-498-216    Sequence 216, App
    35    1708    99.6     379  14  US-10-124-824-216    Sequence 216, App
    36    1708    99.6     379  14  US-10-127-825A-216   Sequence 216, App
    37    1708    99.6     379  14  US-10-127-829A-216   Sequence 216, App
    38    1708    99.6     379  14  US-10-127-835A-216   Sequence 216, App
    39    1708    99.6     379  14  US-10-127-839A-216   Sequence 216, App
    40    1708    99.6     379  14  US-10-127-901A-216   Sequence 216, App
    41    1708    99.6     379  14  US-10-128-693A-216   Sequence 216, App
    42    1708    99.6     379  14  US-10-131-813A-216   Sequence 216, App
    43    1708    99.6     379  14  US-10-131-818A-216   Sequence 216, App
    44    1708    99.6     379  14  US-10-131-823A-216   Sequence 216, App
    45    1708    99.6     379  14  US-10-131-824A-216   Sequence 216, App



ALIGNMENTS 



RESULT 1 
US-09-780-996-9 

; Sequence 9, Application US/09780996
; Patent No. US20020061553A1
; GENERAL INFORMATION:
; APPLICANT: Maury, Isabella
; APPLICANT: Mercken, Luc
; APPLICANT: Fournier, Alain
; TITLE OF INVENTION: Partners of the PTB1 Domain of FE65, Preparation and Uses
; FILE REFERENCE: ST00004-US
; CURRENT APPLICATION NUMBER: US/09/780,996
; CURRENT FILING DATE: 2001-02-09
; PRIOR APPLICATION NUMBER: FR00/01628
; PRIOR FILING DATE: 2000-02-10
; PRIOR APPLICATION NUMBER: US 60/198,500
; PRIOR FILING DATE: 2000-04-18
; NUMBER OF SEQ ID NOS: 9
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 9
; LENGTH: 337
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: misc_feature
; OTHER INFORMATION: X=G, D, V, or A
US-09-780-996-9 

Query Match 99.9%; Score 1713; DB 9; Length 337; 

Best Local Similarity 100.0%; Pred. No. 4.6e-157; 

Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 RGDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRAR 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 RGDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRAR 60

Qy         61 RAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDL 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 RAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDL 120

Qy        121 GGLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQL 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GGLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQL 180

Qy        181 AGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLR 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 AGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLR 240

Qy        241 AQVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQV 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AQVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQV 300

Qy        301 CADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
              |||||||||||||||||||||||||||||||||||||
Db        301 CADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337



RESULT 2 
US-10-726-721-9 

; Sequence 9, Application US/10726721
; Publication No. US20040166109A1
; GENERAL INFORMATION:
; APPLICANT: Maury, Isabella
; APPLICANT: Mercken, Luc
; APPLICANT: Fournier, Alain
; TITLE OF INVENTION: Partners of the PTB1 Domain of FE65, Preparation and Us
; FILE REFERENCE: ST00004-US
; CURRENT APPLICATION NUMBER: US/10/726,721
; CURRENT FILING DATE: 2003-12-03
; PRIOR APPLICATION NUMBER: US/09/780,996A
; PRIOR FILING DATE: 2001-02-09
; PRIOR APPLICATION NUMBER: FR00/01628
; PRIOR FILING DATE: 2000-02-10
; PRIOR APPLICATION NUMBER: US 60/198,500
; PRIOR FILING DATE: 2000-04-18
; NUMBER OF SEQ ID NOS: 11
; SOFTWARE: PatentIn version 3.2
; SEQ ID NO 9
; LENGTH: 337
; TYPE: PRT
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: misc_feature
; OTHER INFORMATION: X=G, D, V, or A
; FEATURE:
; NAME/KEY: misc_feature
; LOCATION: (305)..(305)
; OTHER INFORMATION: Xaa can be any naturally occurring amino acid
US-10-726-721-9 

Query Match 99.9%; Score 1713; DB 16; Length 337; 

Best Local Similarity 100.0%; Pred. No. 4.6e-157; 

Matches 337; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy          1 RGDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRAR 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 RGDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRAR 60

Qy         61 RAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDL 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 RAVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDL 120

Qy        121 GGLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQL 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GGLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQL 180

Qy        181 AGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLR 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 AGLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLR 240

Qy        241 AQVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQV 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 AQVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQV 300

Qy        301 CADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
              |||||||||||||||||||||||||||||||||||||
Db        301 CADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337



RESULT 3 

US-10-408-765A-834 

Sequence 834, Application US/10408765A
Publication No. US20040101874A1
GENERAL INFORMATION:
APPLICANT: Ghosh, Soumitra S.
APPLICANT: Fahy, Eoin D.
APPLICANT: Zhang, Bing
APPLICANT: Gibson, Bradford W.
APPLICANT: Taylor, Steven W.
APPLICANT: Glenn, Gary M.
APPLICANT: Warnock, Dale E.
TITLE OF INVENTION: TARGETS FOR THERAPEUTIC INTERVENTION
TITLE OF INVENTION: IDENTIFIED IN THE MITOCHONDRIAL PROTEOME
FILE REFERENCE: 660088.465
CURRENT APPLICATION NUMBER: US/10/408,765A
CURRENT FILING DATE: 2003-04-04
NUMBER OF SEQ ID NOS: 3077
SOFTWARE: FastSEQ for Windows Version 4.0
SEQ ID NO 834
LENGTH: 342
TYPE: PRT
ORGANISM: Homo sapiens
US-10-408-765A-834 

Query Match 99.6%; Score 1708; DB 16; Length 342; 

Best Local Similarity 99.7%; Pred. No. 1.4e-156; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy          2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          7 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 66

Qy         62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         67 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 126

Qy        122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        127 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 186

Qy        182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        187 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 246

Qy        242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        247 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 306

Qy        302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
              ||| ||||||||||||||||||||||||||||||||
Db        307 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 342



RESULT 4 

US-10-028-072-216 

; Sequence 216, Application US/10028072
Publication No. US20030004311A1
GENERAL INFORMATION:
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang
TITLE OF INVENTION:
FILE REFERENCE:

CURRENT APPLICATION NUMBER: US/10/028,072 
CURRENT FILING DATE: 2001-12-19 
PRIOR APPLICATION NUMBER: 60/049911 
PRIOR FILING DATE: 1997-06-18 
PRIOR APPLICATION NUMBER: 60/056974 
PRIOR FILING DATE: 1997-08-26 
PRIOR APPLICATION NUMBER: 60/059113 
PRIOR FILING DATE: 1997-09-17 
PRIOR APPLICATION NUMBER: 60/059115 
PRIOR FILING DATE: 1997-09-17 
PRIOR APPLICATION NUMBER: 60/059117 
PRIOR FILING DATE: 1997-09-17 
PRIOR APPLICATION NUMBER: 60/059122 
PRIOR FILING DATE: 1997-09-17 
PRIOR APPLICATION NUMBER: 60/059184 
PRIOR FILING DATE: 1997-09-17 
PRIOR APPLICATION NUMBER: 60/059263 
PRIOR FILING DATE: 1997-09-18 
PRIOR APPLICATION NUMBER: 60/059352 
PRIOR FILING DATE: 1997-09-19 
PRIOR APPLICATION NUMBER: 60/059588 
PRIOR FILING DATE: 1997-09-19 
PRIOR APPLICATION NUMBER: 60/059836 
PRIOR FILING DATE: 1997-09-24 
PRIOR APPLICATION NUMBER: 60/062250 
PRIOR FILING DATE: 1997-10-17 
PRIOR APPLICATION NUMBER: 60/062285 
PRIOR FILING DATE: 1997-10-17 
PRIOR APPLICATION NUMBER: 60/062287 
PRIOR FILING DATE: 1997-10-17 
PRIOR APPLICATION NUMBER: 60/062814 
PRIOR FILING DATE: 1997-10-24 
PRIOR APPLICATION NUMBER: 60/062816 
PRIOR FILING DATE: 1997-10-24 
PRIOR APPLICATION NUMBER: 60/063045 
PRIOR FILING DATE: 1997-10-24 



PRIOR APPLICATION NUMBER: 60/063082
PRIOR FILING DATE: 1997-10-31
PRIOR APPLICATION NUMBER: 60/063127
PRIOR FILING DATE: 1997-10-24
PRIOR APPLICATION NUMBER: 60/063327
PRIOR FILING DATE: 1997-10-27
PRIOR APPLICATION NUMBER: 60/063329
PRIOR FILING DATE: 1997-10-27
PRIOR APPLICATION NUMBER: 60/063550
PRIOR FILING DATE: 1997-10-28
PRIOR APPLICATION NUMBER: 60/063561
PRIOR FILING DATE: 1997-10-28
PRIOR APPLICATION NUMBER: 60/063704
PRIOR FILING DATE: 1997-10-29
PRIOR APPLICATION NUMBER: 60/063733
PRIOR FILING DATE: 1997-10-29
PRIOR APPLICATION NUMBER: 60/063735
PRIOR FILING DATE: 1997-10-29
PRIOR APPLICATION NUMBER: 60/063738
PRIOR FILING DATE: 1997-10-29
PRIOR APPLICATION NUMBER: 60/063755
PRIOR FILING DATE: 1997-10-17
PRIOR APPLICATION NUMBER: 60/064248
PRIOR FILING DATE: 1997-11-03
PRIOR APPLICATION NUMBER: 60/064809
PRIOR FILING DATE: 1997-11-07
PRIOR APPLICATION NUMBER: 60/065186
PRIOR FILING DATE: 1997-11-12
PRIOR APPLICATION NUMBER: 60/065846
PRIOR FILING DATE: 1997-11-17
PRIOR APPLICATION NUMBER: 60/066364
PRIOR FILING DATE: 1997-11-21
PRIOR APPLICATION NUMBER: 60/066453
PRIOR FILING DATE: 1997-11-24
PRIOR APPLICATION NUMBER: 60/066511
PRIOR FILING DATE: 1997-11-24
PRIOR APPLICATION NUMBER: 60/066770
PRIOR FILING DATE: 1997-11-24
PRIOR APPLICATION NUMBER: 60/069212
PRIOR FILING DATE: 1997-12-11
PRIOR APPLICATION NUMBER: 60/069278
PRIOR FILING DATE: 1997-12-11
PRIOR APPLICATION NUMBER: 60/069334
PRIOR FILING DATE: 1997-12-11
PRIOR APPLICATION NUMBER: 60/069694
PRIOR FILING DATE: 1997-12-16
PRIOR APPLICATION NUMBER: 60/072320
PRIOR FILING DATE: 1998-01-23
PRIOR APPLICATION NUMBER: 60/073612
PRIOR FILING DATE: 1998-02-04
PRIOR APPLICATION NUMBER: 60/074086
PRIOR FILING DATE: 1998-02-09
PRIOR APPLICATION NUMBER: 60/074092
PRIOR FILING DATE: 1998-02-09
PRIOR APPLICATION NUMBER: 60/077791
PRIOR FILING DATE: 1998-03-12
PRIOR APPLICATION NUMBER: 60/078910
PRIOR FILING DATE: 1998-03-20
PRIOR APPLICATION NUMBER: 60/079294
PRIOR FILING DATE: 1998-03-25
PRIOR APPLICATION NUMBER: 60/079663
PRIOR FILING DATE: 1998-02-27
PRIOR APPLICATION NUMBER: 60/079728
PRIOR FILING DATE: 1998-03-27
PRIOR APPLICATION NUMBER: 60/080165
PRIOR FILING DATE: 1998-03-31
PRIOR APPLICATION NUMBER: 60/081203
PRIOR FILING DATE: 1998-04-09
PRIOR APPLICATION NUMBER: 60/081229
PRIOR FILING DATE: 1998-04-09
PRIOR APPLICATION NUMBER: 60/081695
PRIOR FILING DATE: 1998-04-14
PRIOR APPLICATION NUMBER: 60/081817
PRIOR FILING DATE: 1998-04-15
PRIOR APPLICATION NUMBER: 60/081818
PRIOR FILING DATE: 1998-04-15
PRIOR APPLICATION NUMBER: 60/082999
PRIOR FILING DATE: 1998-04-24
PRIOR APPLICATION NUMBER: 60/083322
PRIOR FILING DATE: 1998-04-28
PRIOR APPLICATION NUMBER: 60/083545
PRIOR FILING DATE: 1998-04-29
PRIOR APPLICATION NUMBER: 60/084600
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084627
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/084637
PRIOR FILING DATE: 1998-05-07
PRIOR APPLICATION NUMBER: 60/085149
PRIOR FILING DATE: 1998-05-12
PRIOR APPLICATION NUMBER: 60/085323
PRIOR FILING DATE: 1998-05-13
PRIOR APPLICATION NUMBER: 60/085338
PRIOR FILING DATE: 1998-05-13
PRIOR APPLICATION NUMBER: 60/085339
PRIOR FILING DATE: 1998-05-13
PRIOR APPLICATION NUMBER: 60/085579
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085697
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/085704
PRIOR FILING DATE: 1998-05-15
PRIOR APPLICATION NUMBER: 60/086414
PRIOR FILING DATE: 1998-05-22
PRIOR APPLICATION NUMBER: 60/086430
PRIOR FILING DATE: 1998-05-22
PRIOR APPLICATION NUMBER: 60/087106
PRIOR FILING DATE: 1998-05-28
PRIOR APPLICATION NUMBER: 60/088026
PRIOR FILING DATE: 1998-06-04
PRIOR APPLICATION NUMBER: 60/088730
PRIOR FILING DATE: 1998-06-10
PRIOR APPLICATION NUMBER: 60/088741
PRIOR FILING DATE: 1998-06-10



PRIOR APPLICATION NUMBER: 60/088810 
PRIOR FILING DATE: 1998-06-10 
PRIOR APPLICATION NUMBER: 60/088858 
PRIOR FILING DATE: 1998-06-11 
PRIOR APPLICATION NUMBER: 60/089532 
PRIOR FILING DATE: 1998-06-17 
PRIOR APPLICATION NUMBER: 60/089599 
PRIOR FILING DATE: 1998-06-17 
PRIOR APPLICATION NUMBER: 60/089907 
PRIOR FILING DATE: 1998-06-18 
PRIOR APPLICATION NUMBER: 60/089947 
PRIOR FILING DATE: 1998-06-19 
PRIOR APPLICATION NUMBER: 60/090349 
PRIOR FILING DATE: 1998-06-23 
PRIOR APPLICATION NUMBER: 60/090429 
PRIOR FILING DATE: 1998-06-24 
PRIOR APPLICATION NUMBER: 60/090445 
PRIOR FILING DATE: 1998-06-24 
PRIOR APPLICATION NUMBER: 60/090538 
PRIOR FILING DATE: 1998-06-24 
PRIOR APPLICATION NUMBER: 60/090863 
PRIOR FILING DATE: 1998-06-26 
PRIOR APPLICATION NUMBER: 60/091360 
PRIOR FILING DATE: 1998-07-01 
PRIOR APPLICATION NUMBER: 60/091519 
PRIOR FILING DATE: 1998-07-02 
PRIOR APPLICATION NUMBER: 60/091982 
PRIOR FILING DATE: 1998-07-07 

Query Match 99.6%; Score 1708; DB 14; Length 379; 

Best Local Similarity 99.7%; Pred. No. 1.7e-156; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 
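
The percentages in the summary lines above appear to follow directly from the raw counts: Query Match is consistent with the hit score divided by the perfect score (1708 / 1715), and Best Local Similarity with the number of matches divided by the number of aligned positions (335 / 336). The sketch below reproduces that arithmetic. The function names are hypothetical, and the formulas are inferred from the figures printed in this report, not taken from GenCore documentation; in particular, counting indels in the denominator is an assumption.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's self-alignment (perfect) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all aligned positions.

    Assumption: the denominator is the sum of the four reported counts.
    """
    aligned = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned, 1)


# Values taken from the record above: Score 1708, perfect score 1715,
# Matches 335, Conservative 0, Mismatches 1, Indels 0.
print(query_match_pct(1708, 1715))               # 99.6
print(best_local_similarity_pct(335, 0, 1, 0))   # 99.7
```

The same ratio also reproduces the other Query Match values listed in this report, for example 1645 / 1715 for the 95.9% hits.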



Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 5 

US-10-140-808-216 

Sequence 216, Application US/10140808 
Publication No. US20030017563A1 
GENERAL INFORMATION: 
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3330R1C182 
CURRENT APPLICATION NUMBER: US/10/140,808 
CURRENT FILING DATE: 2002-05-07 

Prior Application removed - See File Wrapper or Palm
NUMBER OF SEQ ID NOS: 550
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-140-808-216 

Query Match 99.6%; Score 1708; DB 14; Length 379; 

Best Local Similarity 99.7%; Pred. No. 1.7e-156; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 6 

US-10-121-049-216 

Sequence 216, Application US/10121049 
Publication No. US20030022239A1 
GENERAL INFORMATION: 
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3330R1C17 

CURRENT APPLICATION NUMBER: US/10/121,049
CURRENT FILING DATE: 2002-04-12 

Prior Application removed - See File Wrapper or Palm 
NUMBER OF SEQ ID NOS: 550
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-121-049-216 



Query Match 99.6%; Score 1708; DB 14; Length 379;

Best Local Similarity 99.7%; Pred. No. 1.7e-156;

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 7 

US-10-123-904-216 

Sequence 216, Application US/10123904 
Publication No. US20030022328A1 
GENERAL INFORMATION: 
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC
TITLE OF INVENTION: ACIDS ENCODING THE SAME



FILE REFERENCE: P3330R1C54 

CURRENT APPLICATION NUMBER: US/10/123,904 
CURRENT FILING DATE: 2002-04-16 

Prior Application removed - See File Wrapper or Palm 
NUMBER OF SEQ ID NOS: 550
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-123-904-216 



Query Match 99.6%; Score 1708; DB 14; Length 379;

Best Local Similarity 99.7%; Pred. No. 1.7e-156;

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379





RESULT 8 

US-10-140-470-216 

Sequence 216, Application US/10140470 
Publication No. US20030022331A1 
GENERAL INFORMATION: 
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3330R1C160 
CURRENT APPLICATION NUMBER: US/10/140,470 
CURRENT FILING DATE: 2002-05-06 

Prior Application removed - See Palm or File Wrapper 
NUMBER OF SEQ ID NOS: 550
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-140-470-216 



Query Match 99.6%; Score 1708; DB 14; Length 379; 

Best Local Similarity 99.7%; Pred. No. 1.7e-156; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 9 

US-10-175-746-216 

Sequence 216, Application US/10175746 
Publication No. US20030027270A1 
GENERAL INFORMATION: 
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3330R1C353 

CURRENT APPLICATION NUMBER: US/10/175,746
CURRENT FILING DATE: 2002-06-19 

Prior Application removed - See File Wrapper or Palm 



NUMBER OF SEQ ID NOS: 550
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-175-746-216 

Query Match 99.6%; Score 1708; DB 14; Length 379; 

Best Local Similarity 99.7%; Pred. No. 1.7e-156; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 10 
US-10-176-918-216 

Sequence 216, Application US/10176918 
Publication No. US20030027275A1 
GENERAL INFORMATION: 
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3330R1C382 
CURRENT APPLICATION NUMBER: US/10/176,918 
CURRENT FILING DATE: 2002-06-20 

Prior Application removed - See File Wrapper or Palm 
NUMBER OF SEQ ID NOS: 550
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-176-918-216 

Query Match 99.6%; Score 1708; DB 14; Length 379; 

Best Local Similarity 99.7%; Pred. No. 1.7e-156; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 11 
US-10-176-921-216 

Sequence 216, Application US/10176921 
Publication No. US20030027276A1 
GENERAL INFORMATION: 
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3330R1C288 
CURRENT APPLICATION NUMBER: US/10/176,921 
CURRENT FILING DATE: 2002-06-20 

Prior Application removed - See File Wrapper or Palm 
NUMBER OF SEQ ID NOS: 550
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-176-921-216 

Query Match 99.6%; Score 1708; DB 14; Length 379; 

Best Local Similarity 99.7%; Pred. No. 1.7e-156; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 12 
US-10-137-865-216 

Sequence 216, Application US/10137865
Publication No. US20030032155A1
GENERAL INFORMATION:
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3330R1C154 
CURRENT APPLICATION NUMBER: US/10/137,865 
CURRENT FILING DATE: 2002-05-03 

Prior Application removed - See Palm or File Wrapper 
NUMBER OF SEQ ID NOS: 550 
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-137-865-216 

Query Match 99.6%; Score 1708; DB 14; Length 379; 

Best Local Similarity 99.7%; Pred. No. 1.7e-156; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 13 
US-10-140-474-216 

Sequence 216, Application US/10140474 
Publication No. US20030032156A1 
GENERAL INFORMATION: 
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3330R1C162 
CURRENT APPLICATION NUMBER: US/10/140,474 
CURRENT FILING DATE: 2002-05-06 

Prior Application removed - See Palm or File Wrapper 
NUMBER OF SEQ ID NOS: 550
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-140-474-216 

Query Match 99.6%; Score 1708; DB 14; Length 379; 

Best Local Similarity 99.7%; Pred. No. 1.7e-156; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 14 
US-10-142-431-216 

Sequence 216, Application US/10142431 
Publication No. US20030036179A1 
GENERAL INFORMATION: 
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3330R1C251 
CURRENT APPLICATION NUMBER: US/10/142,431
CURRENT FILING DATE: 2002-05-10 

Prior Application removed - See File Wrapper or Palm 
NUMBER OF SEQ ID NOS: 550
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-142-431-216 



Query Match 99.6%; Score 1708; DB 14; Length 379;

Best Local Similarity 99.7%; Pred. No. 1.7e-156;

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 15 
US-10-143-114-216 

Sequence 216, Application US/10143114 
Publication No. US20030036180A1 
GENERAL INFORMATION: 
APPLICANT: Baker, Kevin P.
APPLICANT: Beresini, Maureen
APPLICANT: DeForge, Laura
APPLICANT: Desnoyers, Luc
APPLICANT: Filvaroff, Ellen
APPLICANT: Gao, Wei-Qiang
APPLICANT: Gerritsen, Mary E.
APPLICANT: Goddard, Audrey
APPLICANT: Godowski, Paul J.
APPLICANT: Gurney, Austin L.
APPLICANT: Sherwood, Steven
APPLICANT: Smith, Victoria
APPLICANT: Stewart, Timothy A.
APPLICANT: Tumas, Daniel
APPLICANT: Watanabe, Colin K
APPLICANT: Wood, William
APPLICANT: Zhang, Zemin

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3330R1C211 

CURRENT APPLICATION NUMBER: US/10/143,114 
CURRENT FILING DATE: 2002-05-09 

Prior Application removed - See Palm or File Wrapper 
NUMBER OF SEQ ID NOS: 550
SEQ ID NO 216 
LENGTH: 379 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-143-114-216 



Query Match 99.6%; Score 1708; DB 14; Length 379;

Best Local Similarity 99.7%; Pred. No. 1.7e-156;

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



Search completed: January 7, 2005, 15:01:14 
Job time : 64.3717 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2005 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: January 7, 2005, 12:37:55 ; Search time 69.2668 Seconds
(without alignments)
2799.340 Million cell updates/sec

Title: US-10-726-721A-9
Perfect score: 1715
Sequence: 1 RGDVDDAGDCSGARYNDWSD...VKVGKFMAKLAEHMFPKSQE 337



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 1825181 seqs, 575374646 residues 

Total number of hits satisfying chosen parameters: 1825181 



Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : UniProt_02:* 

1: uniprot_sprot:*
2: uniprot_trembl:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
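
The throughput figure in the run header can be cross-checked with a short Python sketch. It assumes that a cell update means filling one cell of the query-by-database dynamic programming matrix, so the rate is query length times residues searched, divided by search time. That reading, and the function name, are assumptions made for illustration and are not taken from GenCore documentation.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int, seconds: float) -> float:
    """Matrix cells filled per second, in millions (assumed definition)."""
    return query_len * db_residues / seconds / 1e6


# Values copied from the run header of this search:
# query length 337, 575374646 residues searched, 69.2668 seconds.
rate = million_cell_updates_per_sec(337, 575_374_646, 69.2668)
print(f"{rate:.2f} million cell updates/sec")
```

Under this assumption the result agrees with the reported 2799.340 to within the rounding of the search time.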

SUMMARIES

                     %
Result             Query
 No.        Score  Match       Length  DB  ID           Description

   1         1708   99.6          342   2  Q7L8L7       Q7l8l7 homo sapien
   2         1708   99.6          379   2  Q9UH62       Q9uh62 homo sapien
   3         1708   99.6          379   2  AAQ89438     Aaq89438 homo sapi
   4         1645   95.9          379   2  Q8BHS6       Q8bhs6 mus musculu
   5         1645   95.9          379   2  Q91VP8       Q91vp8 mus musculu
   6         1625   94.8          379   2  Q9DC32       Q9dc32 mus musculu
   7        781.5   45.6          453   2  Q9P291       Q9p291 homo sapien
   8        766.5   44.7          456   2  Q9CX83       Q9cx83 mus musculu
   9        766.5   44.7          456   2  AAH68228     Aah68228 mus muscu
  10          683   39.8          632   2  Q7L311       Q7l311 homo sapien
  11          683   39.8          710   2  O60267       O60267 homo sapien
  12          683   39.8          710   2  BAA25438     Baa25438 homo sapi
  13        671.5   39.2          308   2  Q9BTM6       Q9btm6 homo sapien
  14        670.5   39.1          308   2  Q8IZC1       Q8izc1 homo sapien
  15          664   38.7          784   2  Q8BJ82       Q8bj82 m mus muscu
  16  [illegible]   38.7  [illegible]   2  Q8N2F6       Q8n2f6 homo sapien
  17  [illegible]   38.7          324   2  Q91VZ8       Q91vz8 mus musculu
  18  [illegible]   38.7          722   2  Q8BTE9       Q8bte9 m mus muscu
  19          663   38.7          784   2  Q8BJ81       Q8bj81 mus musculu
  20          663   38.7          784   2  Q8BTE8       Q8bte8 mus musculu
  21          663   38.7          784   2  Q9CXI9       Q9cxi9 m mus muscu
  22        641.5   37.4          306   2  Q9CZ87       Q9cz87 mus musculu
  23        640.5   37.3          306   2  Q9D0L7       Q9d0l7 mus musculu
  24        640.5   37.3          306   2  AAH58573     Aah58573 mus muscu
  25        640.5   37.3          306   2  AAH38487     Aah38487 mus muscu
  26        623.5   36.4          388   2  Q9CUN3       Q9cun3 mus musculu
  27        575.5   33.6          995   2  Q8K2R3       Q8k2r3 mus musculu
  28        573.5   33.4          340   2  Q8R103       Q8r103 mus musculu
  29        545.5   31.8          212   2  Q75ML8       Q75ml8 homo sapien
  30        545.5   31.8          212   2  AAS07531     Aas07531 homo sapi
  31        543.5   31.7          367   2  Q9H2Q0       Q9h2q0 homo sapien
  32          452   26.4          249   2  Q8IZC3       Q8izc3 homo sapien
  33          445   25.9          284   2  Q8IZC2       Q8izc2 homo sapien
  34          389   22.7          283   2  Q9BVZ3       Q9bvz3 homo sapien
  35          389   22.7          558   2  Q6P1M9       Q6p1m9 homo sapien
  36          389   22.7          558   2  Q9H969       Q9h969 homo sapien
  37          389   22.7          558   2  AAH58904     Aah58904 homo sapi
  38          389   22.7          558   2  AAH64983     Aah64983 homo sapi
  39        354.5   20.7          497   2  Q8R0B3       Q8r0b3 mus musculu
  40        346.5   20.2          300   2  Q7L4S7       Q7l4s7 homo sapien
  41        346.5   20.2  [illegible]   2  [illegible]  [illegible] homo sapi
  42        346.5   20.2          321   2  Q9NTS2       Q9nts2 homo sapien
  43          345   20.1         1404   2  O43168       O43168 homo sapien
  44        330.5   19.3          236   2  Q9NWJ3       Q9nwj3 homo sapien
  45        328.5   19.2          301   2  Q6TEL9       Q6tel9 brachydanio
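
The % Query Match column appears to be the hit score expressed as a percentage of the perfect score (1715 for this query). The Python sketch below checks that reading against a few rows of the summary list. The formula and the function name are assumptions inferred from the listed values, not taken from GenCore documentation.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect score, to one decimal (assumed formula)."""
    return round(100.0 * score / perfect_score, 1)


PERFECT_SCORE = 1715  # "Perfect score" from the run header

# Scores copied from results 1, 4, 6, 7 and 45 of the summary list.
for score in (1708, 1645, 1625, 781.5, 328.5):
    print(score, query_match_percent(score, PERFECT_SCORE))
```

For these rows the computed values match the percentages printed in the list (99.6, 95.9, 94.8, 45.6 and 19.2).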



ALIGNMENTS 



RESULT 1 
Q7L8L7 

ID Q7L8L7 PRELIMINARY; PRT; 342 AA. 

AC Q7L8L7; 

DT 05-JUL-2004 (TrEMBLrel. 27, Created) 

DT 05-JUL-2004 (TrEMBLrel. 27, Last sequence update) 

DT 05-JUL-2004 (TrEMBLrel. 27, Last annotation update) 

DE DJ545K15.2 (ALEX 3 (Protein similar to KIAA0512 and KIAA0443)) (BM-
DE 017).
GN Name=dJ545K15.2;
OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Wilson S.;
RL Submitted (MAY-2000) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Bone marrow; 

RX MEDLINE=20499367; PubMed=11042152;
RA Zhang Q.H., Ye M., Wu X.Y., Ren S.X., Zhao M., Zhao C.J., Fu G.,
RA Shen Y., Fan H.Y., Lu G., Zhong M., Xu X.R., Han Z.G., Zhang J.W.,
RA Tao J., Huang Q.H., Zhou J., Hu G.X., Gu J., Chen S.J., Chen Z.;
RT "Cloning and functional analysis of cDNAs with open reading frames for
RT 300 previously undefined genes expressed in CD34+ hematopoietic
RT stem/progenitor cells.";

RL Genome Res. 10:1546-1560(2000). 

DR EMBL; AL121883; CAB92763.1; -.
DR EMBL; AF208859; AAF64273.1; -.
DR InterPro; IPR008938; ARM.
DR InterPro; IPR000225; Armadillo.
DR InterPro; IPR006911; DUF634.

DR Pfam; PF04826; DUF634; 1. 

DR PROSITE; PS50176; ARM_REPEAT; 1. 

SQ SEQUENCE 342 AA; 38399 MW; 6158B311FE7CB5FB CRC64; 



Query Match 99.6%; Score 1708; DB 2; Length 342; 

Best Local Similarity 99.7%; Pred. No. 3.9e-118; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   7 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 66

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  67 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 126

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 127 GLPIVAKILNTRDPIVKEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 186

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 187 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 246

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 247 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 306

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 307 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 342
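
The Best Local Similarity figure reported with each alignment appears to be the number of identical positions divided by the number of aligned columns. The Python sketch below parses a counts line of the form printed above and recomputes the percentage. The column count (matches plus conservative plus mismatches plus indels), the regular expression and both function names are assumptions made for illustration, not GenCore's documented definition.

```python
import re


def parse_counts(line: str) -> dict:
    """Parse 'Matches 335; Conservative 0; ...' into a dict of integer counts."""
    return {name: int(value) for name, value in re.findall(r"(\w+)\s+(\d+);", line)}


def best_local_similarity(counts: dict) -> float:
    """Identical positions as a percentage of aligned columns (assumed formula)."""
    columns = (counts["Matches"] + counts["Conservative"]
               + counts["Mismatches"] + counts["Indels"])
    return round(100.0 * counts["Matches"] / columns, 1)


# Counts line as printed for results 1 to 3.
counts = parse_counts("Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0;")
print(counts, best_local_similarity(counts))
```

With the counts reported for results 4 and 5 (322, 5, 9, 0) and for result 6 (319, 5, 12, 0), the same formula gives 95.8 and 94.9, which agree with the values printed for those hits.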





RESULT 2 
Q9UH62 



ID Q9UH62 PRELIMINARY; PRT; 379 AA. 

AC Q9UH62; Q9NPE4 ; 

DT 01-MAY-2000 (TrEMBLrel. 13, Created) 

DT 01-MAY-2000 (TrEMBLrel. 13, Last sequence update) 

DT 01-OCT-2004 (TrEMBLrel. 28, Last annotation update) 

DE Hypothetical protein (ALEX3 protein).
GN Name=ARMCX3; Synonyms=alex3; ORFNames=UNQ2517;
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 



OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Kidney; 

RA Nicolas G., Galand C., Lecomte M.-C.;
RL Submitted (DEC-1999) to the EMBL/GenBank/DDBJ databases.

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Skin; 

RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,
RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,
RA Jones S.J., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Skin; 

RA Strausberg R.;
RL Submitted (MAR-2001) to the EMBL/GenBank/DDBJ databases.

RN [4] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22887296; PubMed=12975309;
RA Clark H.F., Gurney A.L., Abaya E., Baker K., Baldwin D., Brush J.,
RA Chen J., Chow B., Chui C., Crowley C., Currell B., Deuel B., Dowd P.,
RA Eaton D., Foster J., Grimaldi C., Gu Q., Hass P.E., Heldens S.,
RA Huang A., Kim H.S., Klimowski L., Jin Y., Johnson S., Lee J.,
RA Lewis L., Liao D., Mark M., Robbie E., Sanchez C., Schoenfeld J.,
RA Seshagiri S., Simmons L., Singh J., Smith V., Stinson J., Vagts A.,
RA Vandlen R., Watanabe C., Wieand D., Woods K., Xie M.H., Yansura D.,
RA Yi S., Yu G., Yuan J., Zhang M., Zhang Z., Goddard A., Wood W.I.,
RA Godowski P.;
RT "The secreted protein discovery initiative (SPDI), a large-scale
RT effort to identify novel human secreted and transmembrane proteins: a
RT bioinformatics assessment.";

RL Genome Res. 13:2265-2270(2003). 

RN [5] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Testis; 

RX MEDLINE=21092608; PubMed=11162520;
RA Kurochkin I.V., Yonemitsu N., Funahashi S., Nomura H.;
RT "ALEX1, a novel human armadillo-repeat protein that is expressed
RT differentially in normal tissues and carcinomas.";
RL Biochem. Biophys. Res. Commun. 280:340-347(2001).
DR EMBL; AF211175; AAF24487.1; -.
DR EMBL; BC005194; AAH05194.1; -.
DR EMBL; AY359079; AAQ89438.1; -.
DR EMBL; AB039669; BAA94602.1; -.

DR InterPro; IPR008938; ARM. 

DR InterPro; IPR000225; Armadillo. 

DR InterPro; IPR006911; DUF634. 

DR Pfam; PF04826; DUF634; 1. 

DR PROSITE; PS50176; ARM_REPEAT; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 379 AA; 42500 MW; B715D7F83DF4DFB0 CRC64; 

Query Match 99.6%; Score 1708; DB 2; Length 379; 

Best Local Similarity 99.7%; Pred. No. 4.5e-118; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379

RESULT 3 
AAQ89438 

ID AAQ89438 PRELIMINARY; PRT; 379 AA. 

AC AAQ89438; 

DT 02-MAR-2004 (TrEMBLrel. 27, Created) 

DT 02-MAR-2004 (TrEMBLrel. 27, Last sequence update) 

DT 02-MAR-2004 (TrEMBLrel. 27, Last annotation update) 

DE ALEX3.
GN UNQ2517.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.

OX NCBI_TaxID=9606; 

RN [1] 



RP SEQUENCE FROM N.A.
RX PubMed=12975309;
RA Clark H.F., Gurney A.L., Abaya E., Baker K., Baldwin D., Brush J.,
RA Chen J., Chow B., Chui C., Crowley C., Currell B., Deuel B., Dowd P.,
RA Eaton D., Foster J., Grimaldi C., Gu Q., Hass P.E., Heldens S.,
RA Huang A., Kim H.S., Klimowski L., Jin Y., Johnson S., Lee J.,
RA Lewis L., Liao D., Mark M., Robbie E., Sanchez C., Schoenfeld J.,
RA Seshagiri S., Simmons L., Singh J., Smith V., Stinson J., Vagts A.,
RA Vandlen R., Watanabe C., Wieand D., Woods K., Xie M.H., Yansura D.,
RA Yi S., Yu G., Yuan J., Zhang M., Zhang Z., Goddard A., Wood W.I.,
RA Godowski P.;
RT "The Secreted Protein Discovery Initiative (SPDI), a Large-Scale

RT Effort to Identify Novel Human Secreted and Transmembrane Proteins: A 

RT Bioinformatics Assessment."; 

RL Genome Res. 13:2265-2270(2003). 

DR EMBL; AY359079; AAQ89438.1; -. 

SQ SEQUENCE 379 AA; 42500 MW; B715D7F83DF4DFB0 CRC64; 

Query Match 99.6%; Score 1708; DB 2; Length 379; 

Best Local Similarity 99.7%; Pred. No. 4.5e-118; 

Matches 335; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 284 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||||||||||||||||||||||||||||||
Db 344 ADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 379



RESULT 4 
Q8BHS6 

ID Q8BHS6 PRELIMINARY; PRT; 379 AA.
AC Q8BHS6;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-OCT-2004 (TrEMBLrel. 28, Last annotation update)
DE Mus musculus 8 days embryo whole body cDNA, RIKEN full-length enriched
DE library, clone:5730422G06 product:HYPOTHETICAL 42.5 kDa PROTEIN
DE (ALEX3) (ALEX 3 PROTEIN) homolog (ALEX 3 protein).
GN Name=1200004E24Rik; Synonyms=Armcx3;
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Whole body;
RX MEDLINE=99279253; PubMed=10349636;
RA Carninci P., Hayashizaki Y.;
RT "High-efficiency full-length cDNA cloning.";
RL Meth. Enzymol. 303:19-44(1999).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Whole body;
RX MEDLINE=21085660; PubMed=11217851;

RA RIKEN FANTOM Consortium; 

RT "Functional annotation of a full-length mouse cDNA collection."; 

RL Nature 409:685-690(2001). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Whole body; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

RN [4] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Whole body; 

RX MEDLINE=20499374; PubMed=11042159;
RA Carninci P., Shibata Y., Hayatsu N., Sugahara Y., Shibata K., Itoh M.,
RA Konno H., Okazaki Y., Muramatsu M., Hayashizaki Y.;

RT "Normalization and subtraction of cap-trapper-selected cDNAs to 

RT prepare full-length cDNA libraries for rapid discovery of new genes."; 

RL Genome Res. 10:1617-1630(2000). 

RN [5] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Whole body; 

RX MEDLINE=20530913; PubMed=11076861;
RA Shibata K., Itoh M., Aizawa K., Nagaoka S., Sasaki N., Carninci P.,
RA Konno H., Akiyama J., Nishi K., Kitsunai T., Tashiro H., Itoh M.,
RA Sumi N., Ishii Y., Nakamura S., Hazama M., Nishine T., Harada A.,
RA Yamamoto R., Matsumoto H., Sakaguchi S., Ikegami T., Kashiwagi K.,
RA Fujiwake S., Inoue K., Togawa Y., Izawa M., Ohara E., Watahiki M.,
RA Yoneda Y., Ishikawa T., Ozawa K., Tanaka T., Matsuura S., Kawai J.,
RA Okazaki Y., Muramatsu M., Inoue Y., Kira A., Hayashizaki Y.;
RT "RIKEN integrated sequence analysis (RISA) system-384-format

RT sequencing pipeline with 384 multicapillary sequencer."; 

RL Genome Res. 10:1757-1771(2000). 

RN [6] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Whole body; 

RA Adachi J., Aizawa K., Akimura T., Arakawa T., Bono H., Carninci P.,
RA Fukuda S., Furuno M., Hanagaki T., Hara A., Hashizume W.,
RA Hayashida K., Hayatsu N., Hiramoto K., Hiraoka T., Hirozane T.,
RA Hori F., Imotani K., Ishii Y., Itoh M., Kagawa I., Kasukawa T.,
RA Katoh H., Kawai J., Kojima Y., Kondo S., Konno H., Kouda M., Koya S.,
RA Kurihara C., Matsuyama T., Miyazaki A., Murata M., Nakamura M.,
RA Nishi K., Nomura K., Numazaki R., Ohno M., Ohsato N., Okazaki Y.,
RA Saito R., Saitoh H., Sakai C., Sakai K., Sakazume N., Sano H.,
RA Sasaki D., Shibata K., Shinagawa A., Shiraki T., Sogabe Y., Tagami M.,
RA Tagawa A., Takahashi F., Takaku-Akahira S., Takeda Y., Tanaka T.,
RA Tomaru A., Toya T., Yasunishi A., Muramatsu M., Hayashizaki Y.;
RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.

RN [7] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Mix FVB/N; 

RC TISSUE=Mammary tumor. WAP-TGF alpha model. 7 months old; 

RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,
RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,
RA Jones S.J., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [8] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Mix FVB/N; 

RC TISSUE=Mammary tumor. WAP-TGF alpha model. 7 months old; 

RA Strausberg R. ; 

RL Submitted (APR-2003) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AK030729; BAC27102.1; -.
DR EMBL; BC051113; AAH51113.1; -.
DR MGD; MGI:1918953; 1200004E24Rik.
DR InterPro; IPR008938; ARM.
DR InterPro; IPR000225; Armadillo.
DR InterPro; IPR006911; DUF634.
DR Pfam; PF04826; DUF634; 1.
DR PROSITE; PS50176; ARM_REPEAT; 1.

KW Hypothetical protein. 

SQ SEQUENCE 379 AA; 42619 MW; 6EA7B87544652055 CRC64; 

Query Match 95.9%; Score 1645; DB 2; Length 379; 

Best Local Similarity 95.8%; Pred. No. 2e-113; 

Matches 322; Conservative 5; Mismatches 9; Indels 0; Gaps 0; 



Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       |||||||||||||||||||||||||||||||||||||||||||||||:|||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSWAENQRRLKVYMNQVCDDTVTSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHILANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       |||||||||||||| ||||||||:|||||||||||||||| || | ||||||||||||||
Db 284 QVPSSLGSLFNKKEYKEVILKLLIIFENINDNFKWEENEPAQNHFSEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||| ||| |:||||||:||| | |||||||
Db 344 ADKVLGIESRHDFQVRVKVGKFVAKLTERMFPKSQE 379

RESULT 5 
Q91VP8 

ID Q91VP8 PRELIMINARY; PRT; 379 AA. 

AC Q91VP8; 

DT 01-DEC-2001 (TrEMBLrel. 19, Created)
DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT 01-MAR-2004 (TrEMBLrel. 26, Last annotation update)
DE ALEX 3 protein.
GN Name=Armcx3;
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.

RC STRAIN=Mix FVB/N; 

RC TISSUE=Mammary tumor. WAP-TGF alpha model. 7 months old; 

RX MEDLINE=22388257; PubMed=12477932;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,
RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,
RA Jones S.J., Marra M.A.;



RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Mix FVB/N; 

RC TISSUE=Mammary tumor. WAP-TGF alpha model. 7 months old; 

RA Strausberg R. ; 

RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC011101; AAH11101.1; -.
DR MGD; MGI:1918953; 1200004E24Rik.
DR InterPro; IPR008938; ARM.
DR InterPro; IPR000225; Armadillo.
DR InterPro; IPR006911; DUF634.
DR Pfam; PF04826; DUF634; 1.
DR PROSITE; PS50176; ARM_REPEAT; 1.
SQ SEQUENCE 379 AA; 42649 MW; CE1EC87045695156 CRC64;

Query Match 95.9%; Score 1645; DB 2; Length 379; 

Best Local Similarity 95.8%; Pred. No. 2e-113; 

Matches 322; Conservative 5; Mismatches 9; Indels 0; Gaps 0; 



Qy   2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  44 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 103

Qy  62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSWAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181
       |||||||||||||||||||||||||||||||||||||||||||||||:|||||||||||
Db 164 GLPIVAKILNTRDPIVKEKALIVLNNLSWAENQRRLKVYMNQVCDDTVTSRLNSSVQLA 223

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241
       ||||||||||||||||:|||||||||||||||||||||||||||||||||||||||||||
Db 224 GLRLLTNMTVTNEYQHILANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301
       |||||||||||||| ||||||||:|||||||||||||||| || | ||||||||||||||
Db 284 QVPSSLGSLFNKKEYKEVILKLLIIFENINDNFKWEENEPAQNHFSEGSLFFFLKEFQVC 343

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337
       ||| ||||| ||| |:||||||:||| | |||||||
Db 344 ADKVLGIESRHDFQVRVKVGKFVAKLTERMFPKSQE 379



RESULT 6
Q9DC32
ID Q9DC32 PRELIMINARY; PRT; 379 AA.
AC Q9DC32;
DT 01-JUN-2001 (TrEMBLrel. 17, Created)
DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update)
DT 01-MAR-2004 (TrEMBLrel. 26, Last annotation update)
DE Mus musculus adult male lung cDNA, RIKEN full-length enriched library,
DE clone:1200004E24 product:HYPOTHETICAL 42.5 kDa PROTEIN (ALEX3) (ALEX 3
DE PROTEIN) homolog.
GN Name=1200004E24Rik;
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Lung; 

RX MEDLINE=99279253; PubMed=10349636; 

RA Carninci P., Hayashizaki Y.;
RT "High-efficiency full-length cDNA cloning.";
RL Meth. Enzymol. 303:19-44(1999).
RN [2]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Lung;
RX MEDLINE=21085660; PubMed=11217851;
RA RIKEN FANTOM Consortium;
RT "Functional annotation of a full-length mouse cDNA collection.";

RL Nature 409:685-690(2001). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Lung; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

RN [4] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Lung; 

RX MEDLINE=20499374; PubMed=11042159;
RA Carninci P., Shibata Y., Hayatsu N., Sugahara Y., Shibata K., Itoh M.,
RA Konno H., Okazaki Y., Muramatsu M., Hayashizaki Y.;

RT "Normalization and subtraction of cap-trapper-selected cDNAs to 

RT prepare full-length cDNA libraries for rapid discovery of new genes."; 

RL Genome Res. 10:1617-1630(2000). 

RN [5] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Lung; 

RX MEDLINE=20530913; PubMed=11076861; 

RA Shibata K., Itoh M., Aizawa K., Nagaoka S., Sasaki N., Carninci P.,
RA Konno H., Akiyama J., Nishi K., Kitsunai T., Tashiro H., Itoh M.,
RA Sumi N., Ishii Y., Nakamura S., Hazama M., Nishine T., Harada A.,
RA Yamamoto R., Matsumoto H., Sakaguchi S., Ikegami T., Kashiwagi K.,
RA Fujiwake S., Inoue K., Togawa Y., Izawa M., Ohara E., Watahiki M.,
RA Yoneda Y., Ishikawa T., Ozawa K., Tanaka T., Matsuura S., Kawai J.,
RA Okazaki Y., Muramatsu M., Inoue Y., Kira A., Hayashizaki Y.;
RT "RIKEN integrated sequence analysis (RISA) system-384-format

RT sequencing pipeline with 384 multicapillary sequencer."; 

RL Genome Res. 10:1757-1771(2000). 

RN [6] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Lung; 

RA Adachi J., Aizawa K., Akahira S., Akimura T., Arai A., Aono H.,
RA Arakawa T., Bono H., Carninci P., Fukuda S., Fukunishi Y., Furuno M.,
RA Hanagaki T., Hara A., Hayatsu N., Hiramoto K., Hiraoka T., Hori F.,
RA Imotani K., Ishii Y., Itoh M., Izawa M., Kasukawa T., Kato H.,
RA Kawai J., Kojima Y., Konno H., Kouda M., Koya S., Kurihara C.,
RA Matsuyama T., Miyazaki A., Nishi K., Nomura K., Numazaki R., Ohno M.,
RA Okazaki Y., Okido T., Owa C., Saito H., Saito R., Sakai C., Sakai K.,
RA Sano H., Sasaki D., Shibata K., Shibata Y., Shinagawa A., Shiraki T.,
RA Sogabe Y., Suzuki H., Tagami M., Tagawa A., Takahashi F., Tanaka T.,
RA Tejima Y., Toya T., Yamamura T., Yasunishi A., Yoshida K., Yoshino M.,
RA Muramatsu M., Hayashizaki Y.;
RL Submitted (JUL-2000) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AK004598; BAB23399.1; -.
DR MGD; MGI:1918953; 1200004E24Rik.
DR InterPro; IPR008938; ARM.
DR InterPro; IPR000225; Armadillo.
DR InterPro; IPR006911; DUF634.

DR Pfam; PF04826; DUF634; 1. 

DR PROSITE; PS50176; ARM_REPEAT; 1. 

KW Hypothetical protein. 

SQ SEQUENCE 379 AA; 42662 MW; F40A039CD6E4911F CRC64; 

Query Match 94.8%; Score 1625; DB 2; Length 379; 

Best Local Similarity 94.9%; Pred. No. 6.2e-112; 

Matches 319; Conservative 5; Mismatches 12; Indels 0; Gaps 0; 

Qy 2 GDVDDAGDCSGARYNDWSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARR 61

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I 
Db 44 GDVT)DAGDCPGARYNDWSDDDDDSYESKSIWYPPW7VRIGTEAGTRARARARARATRARR 103 

Qy 62 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 121

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I I I i I I I I 
Db 104 AVQKRASPNSDDTVLSPQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLG 163

Qy 122 GLPIVAKILNTRDPIVKEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLA 181

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I 
Db 164 GLPIVAKILNTRDPIV1<EKALIVXNNLSWAENQRRLKWMNQVCDDTWSRLNSSVQLA 223 

Qy 182 GLRLLTNMTVTNEYQHMLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 241

I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I ! I I I I I I I I I I I II I I I I I I I I I I I I II I I 
Db 224 GLRLLTNMTVTNEYQHILANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRA 283 

Qy 242 QVPSSLGSLFNKKENKEVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVC 301

I I I I I I I I I I I I I I I I I I I I I I : I I I I II I I I I I I I II I II I III I I I I I I II I I 
Db 284 QVPSSLGSLFNKKEYKEVILKLLIIFENINDNFKWEENEPAQNHFSEGSPFFFLKEFQVC 343 

Qy 302 ADKXLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE 337

III I I I I I III I : I I I I I I : I I I I II I I I I I 
Db 344 ADKVLGIESRHDFQVRVKVGKFVAKLTERMFPKSQE 379 

RESULT 7 
Q9P291 

ID Q9P291 PRELIMINARY; PRT; 453 AA. 

AC Q9P291; 

DT 01-OCT-2000 (TrEMBLrel. 15, Created) 

DT 01-OCT-2000 (TrEMBLrel. 15, Last sequence update) 

DT 01-OCT-2004 (TrEMBLrel. 28, Last annotation update) 

DE ALEX1 (Armadillo repeat containing, X-linked 1) (Hypothetical protein 
DE FLJ90304). 

GN Name=alex1; Synonyms=ARMCX1;



OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=21092608; PubMed=11162520; 

RA Kurochkin I.V., Yonemitsu N., Funahashi S., Nomura H.; 

RT "ALEX1, a novel human armadillo repeat protein that is expressed

RT differentially in normal tissues and carcinomas."; 

RL Biochem. Biophys. Res. Commun. 280:340-347(2001).

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Uterus; 

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,

RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,

RA Jones S.J., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Uterus; 

RA Strausberg R.;

RL Submitted (FEB-2001) to the EMBL/GenBank/DDBJ databases. 

RN [4] 

RP SEQUENCE FROM N.A. 

RA Isogai T., Ota T., Nishikawa T., Hayashi K., Otsuki T., Sugiyama T.,

RA Suzuki Y., Nagai K., Sugano S., Ishii S., Kawai-Hio Y., Saito K.,

RA Yamamoto J., Wakamatsu A., Nakamura Y., Kojima S., Nagahari K.,

RA Masuho Y., Ono T., Okano K., Yoshikawa Y., Aotsuka S., Sasaki N.,

RA Hattori A., Okumura K., Iwayanagi T., Ninomiya K.;

RL Submitted (MAR-2002) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AB039670; BAA94603.1; -. 

DR EMBL; BC002691; AAH02691.1; -. 

DR EMBL; AK074785; BAC11208.1; -. 

DR PIR; JC7582; JC7582. 

DR InterPro; IPR008938; ARM. 

DR InterPro; IPR000225; Armadillo. 

DR InterPro; IPR006911; DUF634.

DR Pfam; PF04826; DUF634; 1.

DR PROSITE; PS50176; ARM_REPEAT; 1.



SQ SEQUENCE 453 AA; 49180 MW; 01BED98EC3F64672 CRC64; 



Query Match 45.6%; Score 781.5; DB 2; Length 453; 

Best Local Similarity 52.3%; Pred. No. 2e-49; 

Matches 157; Conservative 56; Mismatches 74; Indels 13; Gaps 3; 



Qy 39 RIGTEAGTRA RARARARATRA RRAVQKRASPNSDDTVLSPQELQKVLCL 87 

11:1111 : : | | : : : | | | I I I I I : I I : I I I I I : 

Db 156 RSGSRAGGRASGKSKGKARSKSTRAPATTWPVRRG — KFNFPYKIDDILSAPDLQKVLNI 213 

Qy 88 VEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKALIVLNN 147

: I : I : I I I I : I I I I I I I : I I : : I I : I I I : I I : I I : : I : I I I : : I I III 
Db 214 LERTNDPFIQEVALVTLGNNAAYSFNQNAIRELGGVPIIAKLIKTKDPIIREKTYNALNN 273 

Qy 148 LSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEYQHMLANSISDFF 207

MINIM ::| |::|IMII: I I : I : I I : I I I I I I I I I I I I I I III 

Db 274 LSVNAENQGKIKTYISQVCDDTMVCRLDSAVQMAGLRLLTNMTVTNHYQHLLSYSFPDFF 333

Qy 208 RLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVILKLLVIF 267

I II I I : I : : M : : I I I I I II I I I : : I I I I I I II I : : : I : : I : I : I 
Db 334 ALLFLGNHFTKIQIMKLIINFTENPAMTRELVSCKVPSELISLFNKEWDREILLNILTLF 393 

Qy 268 ENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVCADKXLGIESHHDFLVKVKVGKFMAKL 327 

I I I I I I I I :: : I I I I I II II I : : I : I : I I I I I I : M 

Db 394 ENINDNIKNEGLASSRKEFSRSSLFFLFKESGVCVT^KIKALANHNDLV^ 453 



RESULT 8 
Q9CX83 



ID Q9CX83 PRELIMINARY; PRT; 456 AA. 

AC Q9CX83; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-OCT-2004 (TrEMBLrel. 28, Last annotation update) 

DE Mus musculus 12 days embryo head cDNA, RIKEN full-length enriched 

DE library, clone:3010033I09 product:similar to ALEX1 (ALEX1 PROTEIN)

DE (3010033I09Rik protein) (Armadillo repeat containing, X-linked

DE 1).

GN Name=3010033I09Rik; Synonyms=Armcx1;

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Head; 

RX MEDLINE=99279253; PubMed=10349636; 

RA Carninci P., Hayashizaki Y.;

RT "High-efficiency full-length cDNA cloning."; 

RL Meth. Enzymol. 303:19-44(1999). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Head; 

RX MEDLINE=21085660; PubMed=11217851; 

RA RIKEN FANTOM Consortium; 

RT "Functional annotation of a full-length mouse cDNA collection."; 

RL Nature 409:685-690(2001). 



RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Head; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs.";

RL Nature 420:563-573(2002). 

RN [4] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Head; 

RX MEDLINE=20499374; PubMed=11042159; 

RA Carninci P., Shibata Y., Hayatsu N., Sugahara Y., Shibata K., Itoh M.,

RA Konno H., Okazaki Y., Muramatsu M., Hayashizaki Y.;

RT "Normalization and subtraction of cap-trapper-selected cDNAs to 

RT prepare full-length cDNA libraries for rapid discovery of new genes."; 

RL Genome Res. 10:1617-1630(2000). 

RN [5] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Head; 

RX MEDLINE=20530913; PubMed=11076861; 

RA Shibata K., Itoh M., Aizawa K., Nagaoka S., Sasaki N., Carninci P.,

RA Konno H., Akiyama J., Nishi K., Kitsunai T., Tashiro H., Itoh M.,

RA Sumi N., Ishii Y., Nakamura S., Hazama M., Nishine T., Harada A.,

RA Yamamoto R., Matsumoto H., Sakaguchi S., Ikegami T., Kashiwagi K.,

RA Fujiwake S., Inoue K., Togawa Y., Izawa M., Ohara E., Watahiki M.,

RA Yoneda Y., Ishikawa T., Ozawa K., Tanaka T., Matsuura S., Kawai J.,

RA Okazaki Y., Muramatsu M., Inoue Y., Kira A., Hayashizaki Y.;

RT "RIKEN integrated sequence analysis (RISA) system-384-format

RT sequencing pipeline with 384 multicapillary sequencer."; 

RL Genome Res. 10:1757-1771(2000). 

RN [6] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Head; 

RA Adachi J., Aizawa K., Akahira S., Akimura T., Arai A., Aono H.,

RA Arakawa T., Bono H., Carninci P., Fukuda S., Fukunishi Y., Furuno M.,

RA Hanagaki T., Hara A., Hayatsu N., Hiramoto K., Hiraoka T., Hori F.,

RA Imotani K., Ishii Y., Itoh M., Izawa M., Kasukawa T., Kato H.,

RA Kawai J., Kojima Y., Konno H., Kouda M., Koya S., Kurihara C.,

RA Matsuyama T., Miyazaki A., Nishi K., Nomura K., Numazaki R., Ohno M.,

RA Okazaki Y., Okido T., Owa C., Saito H., Saito R., Sakai C., Sakai K.,

RA Sano H., Sasaki D., Shibata K., Shibata Y., Shinagawa A., Shiraki T.,

RA Sogabe Y., Suzuki H., Tagami M., Tagawa A., Takahashi F., Tanaka T.,

RA Tejima Y., Toya T., Yamamura T., Yasunishi A., Yoshida K., Yoshino M.,

RA Muramatsu M., Hayashizaki Y.;

RL Submitted (AUG-2000) to the EMBL/GenBank/DDBJ databases. 

RN [7] 

RP SEQUENCE FROM N.A. 

RC STRAIN=FVB/N-3, and C57BL/6;

RC TISSUE=Brain, Eye, and

RC Mammary tumor. MMTV-LTR/INT3 model. 5 month old mouse. Taken by

RC biopsy.;

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,

RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,

RA Jones S.J., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [8] 

RP SEQUENCE FROM N.A. 

RC STRAIN=FVB/N-3; 

RC TISSUE=Mammary tumor. MMTV-LTR/INT3 model. 5 month old mouse. Taken by

RC biopsy.;

RA Strausberg R.;

RL Submitted (JAN-2002) to the EMBL/GenBank/DDBJ databases.

RN [9] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Eye; 

RA Strausberg R.;

RL Submitted (APR-2002) to the EMBL/GenBank/DDBJ databases.

RN [10] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6; TISSUE=Brain; 

RA Strausberg R.;

RL Submitted (MAR-2004) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AK019405; BAB31705.1; -. 

DR EMBL; BC021410; AAH21410.1; -. 

DR EMBL; BC026488; AAH26488.1; -. 

DR EMBL; BC068228; AAH68228.1; -. 

DR MGD; MGI:1925498; 3010033I09Rik.

DR InterPro; IPR008938; ARM.

DR InterPro; IPR000225; Armadillo.

DR InterPro; IPR006911; DUF634.

DR Pfam; PF04826; DUF634; 1. 

DR PROSITE; PS50176; ARM_REPEAT; 1. 

SQ SEQUENCE 456 AA; 50643 MW; 159EFAEF1B536406 CRC64; 

Query Match 44.7%; Score 766.5; DB 2; Length 456;

Best Local Similarity 51.0%; Pred. No. 2.6e-48; 

Matches 151; Conservative 51; Mismatches 81; Indels 13; Gaps 1; 

Qy 45 GT RARARARARAT RARRAVQKRAS PNSDDTVLSPQELQKVLCLVEMS 91 

I : I I I I : I I : : I I I : I I : I I I I I : : I : 

Db 161 GSRARNRTSGKVTCRKNRSKSNKAPATAWPWKGKFSFPYKIDDILSAPDLQKVLNILERT 220 

Qy 92 EKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKALIVLNNLSVN 151

I : I II: I I I I I I I : I I : : I I : I I I : I I : I I : : I I I I I : : I I I I II I I I 
Db 221 NDPFTQEVALVTLGNNAAYSFNQNAIRELGGVPIIAKLIKTRDPIIREKTYNALNNLSVN 280



Qy 152 AENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEYQHMLANSISDFFRLFS 211

: I I I : : I I : : I I I I I I : I I : I : I I : I I I I I I I I I I I I I I I I : I : I III I 
Db 281 SENQGKIKTYISQVCDDTMVCRLDSAVQMAGLRLLTNMTVTNHYQHLLSYSFPDFFALLF 340

Qy 212 AGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVILKLLVIFENIN 271

II 11:1 :||::| III III III: : I I I I I I I I I : : : I : : I : I : I I I I I 
Db 341 LGNHFTKIQTMKLIINFTENPAMTRELVSCKVPSELISLFNKEWDREILLNILTLFENIN 400

Qy 272 DNFKWEENEPTQNQFGEGSLFFFLKEFQVCADKXLGIESHHDFLVKVKVGKFMAKL 327 

II I I :: :| I I I I II II I : I I I : I I I I I I : I I 
Db 401 DNIKSEGLASSRKEFSRSSLFFLFKESGVCVKKIKALASHKDLVVKVKVLKVLTKL 456 



RESULT 9 
AAH68228 

ID AAH68228 PRELIMINARY; PRT; 456 AA. 

AC AAH68228; 

DT 01-JUN-2004 (TrEMBLrel. 27, Created) 

DT 01-JUN-2004 (TrEMBLrel. 27, Last sequence update) 

DT 01-JUN-2004 (TrEMBLrel. 27, Last annotation update) 

DE RIKEN cDNA 3010033I09.

GN 3010033I09RIK.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A.

RC STRAIN=C57BL/6; TISSUE=Brain; 

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,

RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,

RA Jones S.J., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6; TISSUE=Brain; 

RA Strausberg R.;

RL Submitted (MAR-2004) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC068228; AAH68228.1; -. 

SQ SEQUENCE 456 AA; 50643 MW; 159EFAEF1B536406 CRC64; 



Query Match 44.7%; Score 766.5; DB 2; Length 456; 

Best Local Similarity 51.0%; Pred. No. 2.6e-48; 

Matches 151; Conservative 51; Mismatches 81; Indels 13; Gaps 1;



Qy 45 GTRARARARARATRARRAVQKRAS PNSDDTVLSPQELQKVLCLVEMS 91 

I : I I I I : I I: :| I I : I I : I I I I I : : I : 

Db 161 GSRARNRTSGKVTCRKNRSKSNKAPATAWPWKGKFSFPYKIDDILSAPDLQKVLNILERT 220 

Qy 92 EKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKALIVLNNLSVN 151

I: I II: 1111111:11:: I I : I I I : I I : I I : : lllll::|| lllllll 
Db 221 NDPFTQEVALVTLGNNAAYSFNQNAIRELGGVPIIAKLIKTRDPIIREKTYNALNNLSVN 280

Qy 152 AENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEYQHMLANSISDFFRLFS 211

: I I I : : I I : : I I I I I I : I I : I : I I : I I I I I I I I I I I I I I I I : I : I III I 
Db 281 SENQGKIKTYISQVCDDTMVCRLDSAVQMAGLRLLTNMTVTNHYQHLLSYSFPDFFALLF 340

Qy 212 AGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVILKLLVIFENIN 271

II 11:1 :M::| I Mill 111: : I I I I Mill: ::|::| :| =11111 
Db 341 LGNHFTKIQTMKLIINFTENPAMTRELVSCKVPSELISLFNKEWDREILLNILTLFENIN 400 

Qy 272 DNFKWEENEPTQNQFGEGSLFFFLKEFQVCADKXLGIESHHDFLVKVKVGKFMAKL 327

I I I I : : : I I I I I II M I : M I : I M I I I : M 

Db 401 DNIKSEGLASSRKEFSRSSLFFLFKESGVCVKKIKALASHKDLVVKVKVLKVLTKL 456



RESULT 10 
Q7L311 

ID Q7L311 PRELIMINARY; PRT; 632 AA. 

AC Q7L311; 

DT 05-JUL-2004 (TrEMBLrel. 27, Created)

DT 05-JUL-2004 (TrEMBLrel. 27, Last sequence update) 

DT 01-OCT-2004 (TrEMBLrel. 28, Last annotation update) 

DE ALEX2 protein. 

GN Name=ARMCX2;

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Placenta, and Skin; 

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,

RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,

RA Jones S.J., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [2] 

RP SEQUENCE FROM N.A.

RC TISSUE=Skin;

RA Strausberg R.;

RL Submitted (OCT-2001) to the EMBL/GenBank/DDBJ databases.

RN [3] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Placenta; 

RA Strausberg R.;

RL Submitted (AUG-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC015926; AAH15926.1; -. 

DR EMBL; BC012541; AAH12541.1; -. 

DR InterPro; IPR008938; ARM. 

DR InterPro; IPR000225; Armadillo. 

DR InterPro; IPR006911; DUF634.

DR Pfam; PF04826; DUF634; 1. 

DR SMART; SM00185; ARM; 1. 

SQ SEQUENCE 632 AA; 65683 MW; 7627CBFFE61C329B CRC64; 



Query Match 39.8%; Score 683; DB 2; Length 632; 

Best Local Similarity 45.3%; Pred. No. 5.9e-42; 

Matches 140; Conservative 56; Mismatches 91; Indels 22; Gaps 3; 

Qy 18 WSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARRAVQKRASPNSDDTVLS 77

I : I : I I : : I hill hill I I : I 

Db 345 WTDTESDSD SEPETQRRGRGRRPV — AMQKRPFPYEIDEILG 384 

Qy 78 PQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIV 137 

: : I : I I I I : : I : I : I : I I : I I II I : I : : I I I I I I I I : I : : I I I : 
Db 385 WDLRKVLALLQKSDDPFIQQVALLTLSNNANYSCNQETIRKLGGLPIIANMINKTDPHI 444

Qy 138 KEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEYQH 197

Mill: : I I I I I III I I : I I I I : I I I : I I I I : I I : I I : I I I I I : I I : I I I 
Db 445 KEKALMAMNNLSENYENQGRLQWMNKVMDDIMASNLNSAVQW 504 

Qy 198 MLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENK 257 

: I I I I : : I I I I I I : I : : : I I : I I I I I I I : : I I I I I : I I I : I 
Db 505 LLVNSIANFFRLLSQGGGKIKVEILKILSNFAENPDMLKKLLSTQVPASFSSLYNSYVES 564

Qy 258 EVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVCADKXLGIESHHDFLVK 317

I : : : I : I I I I I : I : I : I I I I : II I :: I I I I I I 

Db 565 EILINALTLFEIIYDNLRAEVF--NYREFNKGSLFYLCTTSGVCVKKIRALANHHDLLVK 622

Qy 318 VKVGKFMAK 326 

II I I : I 

Db 623 VKVIKLVNK 631 



RESULT 11 
O60267

ID O60267 PRELIMINARY; PRT; 710 AA.

AC O60267;

DT 01-AUG-1998 (TrEMBLrel. 07, Created)

DT 01-AUG-1998 (TrEMBLrel. 07, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE KIAA0512 protein (Fragment).

GN Name=KIAA0512;

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606;

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain; 

RX MEDLINE=98290545; PubMed=9628581;

RA Nagase T., Ishikawa K., Miyajima N., Tanaka A., Kotani H., Nomura N.,

RA Ohara O.;

RT "Prediction of the coding sequences of unidentified human genes. IX. 

RT The complete sequences of 100 new cDNA clones from brain which can 

RT code for large proteins in vitro."; 

RL DNA Res. 5:31-39(1998). 

DR EMBL; AB011084; BAA25438.2; -. 

DR PIR; T00084; T00084. 

DR InterPro; IPR008938; ARM. 

DR InterPro; IPR000225; Armadillo. 

DR InterPro; IPR006911; DUF634.

DR Pfam; PF04826; DUF634; 1.

DR SMART; SM00185; ARM; 1.

FT NON_TER 1 1

SQ SEQUENCE 710 AA; 74240 MW; DCC56E38A038D780 CRC64; 



Query Match 39.8%; Score 683; DB 2; Length 710; 

Best Local Similarity 45.3%; Pred. No. 6.9e-42; 

Matches 140; Conservative 56; Mismatches 91; Indels 22; Gaps 3; 

Qy 18 WSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARRAVQKRASPNSDDTVLS 77

I : I : I I : : I I : I I I hill I I : I 

Db 423 WTDTESDSD SEPETQRRGRGRRPV AMQKRPFPYEIDEILG 462 

Qy 78 PQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIV 137 

::|:MI |:: |: |:| : II: I III I: |:: II 111111*1 ::| II : 
Db 463 WDLRKVLALLQKSDDPFIQQVALLTLSNNANYSCNQETIRKLGGLPIIANMINKTDPHI 522

Qy 138 KEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEYQH 197

Mill: : I I I I I III I I : I I I I : I II : I I I I : I I : I I : I I I I I : I I : II I 
Db 523 KEKAIMAMNNLSENYENQGRLQVYMNK^ 582 

Qy 198 MLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENK 257 

: I I I I :: I I I I I I : I : : : I I : I I I I I I I : : I I I I I : I I I : I 
Db 583 LLVNSIANFFRLLSQGGGKIKVEILKILSNFAENPDMLKKLLSTQVPASFSSLYNSYVES 642 

Qy 258 EVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVCADKXLGIESHHDFLVK 317 

I : : : I : I I I I I : I : I : I I I I : II I : : I I I I I I 

Db 643 EILINALTLFEIIYDNLRAEVF--NYREFNKGSLFYLCTTSGVCVKKIRALANHHDLLVK 700

Qy 318 VKVGKFMAK 326 

I I I I : I 
Db 701 VKVIKLVNK 709 



RESULT 12 
BAA25438 



ID BAA25438 PRELIMINARY; PRT; 710 AA. 

AC BAA25438; 

DT 02-MAR-2004 (TrEMBLrel. 27, Created) 

DT 02-MAR-2004 (TrEMBLrel. 27, Last sequence update) 

DT 02-MAR-2004 (TrEMBLrel. 27, Last annotation update)

DE KIAA0512 protein (Fragment).

GN KIAA0512.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Brain; 

RX MEDLINE=98290545; PubMed=9628581;

RA Nagase T., Ishikawa K., Miyajima N., Tanaka A., Kotani H., Nomura N.,

RA Ohara O.;

RT "Prediction of the coding sequences of unidentified human genes. IX. 

RT The complete sequences of 100 new cDNA clones from brain which can 

RT code for large proteins in vitro."; 

RL DNA Res. 5:31-39(1998). 

DR EMBL; AB011084; BAA25438.2; -.

FT NON_TER 1 1

SQ SEQUENCE 710 AA; 74240 MW; DCC56E38A038D780 CRC64; 



Query Match 39.8%; Score 683; DB 2; Length 710; 

Best Local Similarity 45.3%; Pred. No. 6.9e-42; 

Matches 140; Conservative 56; Mismatches 91; Indels 22; Gaps 3; 

Qy 18 WSDDDDDSNESKSIVWYPPWARIGTEAGTRARARARARATRARRAVQKRASPNSDDTVLS 77 

I : I : I I : : I hill hill I I : I 

Db 423 WTDTESDSD S EP ETQRRGRGRRPV AMQKRPFPYEIDEILG 462 

Qy 78 PQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIV 137

::|:IM |:: |: |:| : II: I III |: |:: II llllll:| ::| II : 
Db 463 V^DLRKVTiALLQKSDDPFIQQVALLTLSNNANYSCNQETIRKLGGLPIIANMINKTDPHI 522 

Qy 138 KEKALIVLNNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEYQH 197

Mill: : I I I I I III I I : II I I : I II : I I I I : I I : I I : I I I I I : I I : I I I 
Db 523 KEKALMAMNNLSENYENQGRLQVbfMNKVMDDIMAS^ 582 

Qy 198 MLANSISDFFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENK 257

: I III : : I I I I I I : I : : : I I : I I I I I I I : : I I I I I : I I I : I 
Db 583 LLVNSIANFFRLLSQGGGKIKVEILKILSNFAENPDMLKKLLSTQVPASFSSLYNSYVES 642

Qy 258 EVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVCADKXLGIESHHDFLVK 317

I : : : I : I I I I I : I : I : I I I I : I I I : : I I I I I I 

Db 643 EILINALTLFEIIYDNLRAEVF--NYREFNKGSLFYLCTTSGVCVKKIRALANHHDLLVK 700

Qy 318 VKVGKFMAK 326 

I I I I : I 
Db 701 VKVIKLVNK 709



RESULT 13 
Q9BTM6 

ID Q9BTM6 PRELIMINARY; PRT; 308 AA. 

AC Q9BTM6; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created) 

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE SVH protein. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A.

RC TISSUE=Eye;

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,

RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,

RA Jones S.J., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Eye; 

RA Strausberg R.;

RL Submitted (FEB-2001) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC003586; AAH03586.1; -. 

DR InterPro; IPR008938; ARM. 

DR InterPro; IPR006911; DUF634.

DR Pfam; PF04826; DUF634; 1.

SQ SEQUENCE 308 AA; 33840 MW; B25A6C926BA5C101 CRC64; 

Query Match 39.2%; Score 671.5; DB 2; Length 308; 

Best Local Similarity 49.0%; Pred. No. 1.6e-41; 

Matches 148; Conservative 50; Mismatches 91; Indels 13; Gaps 4; 

Qy 37 WARIGTEAG----------TRARARA-RARATRARRAVQKRASPNSDDTVLSPQELQKVL 85

I I I I I I I I I : : : : : I I I I : : : I I I : I 

Db 9 WVAAGLLLGAGACYCIYRLTRGRRRGDRELGIRSSKSAEDLTDGSYDD-VLNAEQLQKLL 67 



Qy 86 CLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKALIVL 145



I : I : I I I : I I I I I I I I I I :: I : I I I : I I I : I I I I : I : : I I I I I I 
Db 68 YLLESTEDPVIIERALITLGNNAAFSVNQAIIRELGGIPIVANKINHSNQSIKEKALNAL 127

Qy 146 NNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEYQHMLANSISD 205

II III I III ::|:|::|||:| : 111 = 111111 I I I I I I I I I :: I I I I : |:| 
Db 128 NNLSVNVENQIKIKIYISQVCEDVFSGSLNSAVQLAGLTLLTNMTVTNDHQHMLHSYITD 187

Qy 206 FFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVILKLLV 265 

I : : II I I : I I I I I I I I I : I I I I I I I I I I I I I I I I : : I I : : I : : I 
Db 188 LFQVLLTGNGNTKVQVLKLLLNLSENPAMTEGLLRAQVDSSFLSLYDSHVAKEILLRVLT 247

Qy 266 IFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVCADKXLGIESHHDFLVKVKVGKFMA 325

: I : I I : I I : I I I I I I I I I : I I I : III II II : 
Db 248 LFQNIKNCLKIEGHLAVQPTFTEGSLFFLL-HGEECAQKIRALVDHHDAEVKEKVVTIIP 306

Qy 326 KL 327 

I : 

Db 307 KI 308 



RESULT 14 
Q8IZC1 

ID Q8IZC1 PRELIMINARY; PRT; 308 AA. 

AC Q8IZC1; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE SVH-B. 

GN Name=SVH; 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Huang R., Xing Z., Luan Z., Wu T., Wu X., Hu G.;

RL Submitted (SEP-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AY150853; AAN72315.1; -. 

DR InterPro; IPR008938; ARM. 

DR InterPro; IPR006911; DUF634. 

DR Pfam; PF04826; DUF634; 1. 

SQ SEQUENCE 308 AA; 33850 MW; 9F74718D8B96C102 CRC64; 

Query Match 39.1%; Score 670.5; DB 2; Length 308; 

Best Local Similarity 49.0%; Pred. No. 1.9e-41; 

Matches 148; Conservative 50; Mismatches 91; Indels 13; Gaps 4;

Qy 37 WARIGTEAG----------TRARARA-RARATRARRAVQKRASPNSDDTVLSPQELQKVL 85

I I I I I I I I I : : : : : I I I I : : : I I I : I 

Db 9 WVAAGLLLGAGACYCIYRLTRGRRRGDRELGIRSSKSAEDLTDGSYDD-VLNAEQLQKLL 67 

Qy 86 CLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIVKEKALIVL 145

I : I : I I I : I I I I I I I I I I : : ! : I I I : I I I I I I I : I : : I I I I I I 
Db 68 YLLESTEDPVIIERALITLGNNAAFSVNQAIIRELGGIPIVANKINHSNQSIKEKALNAL 127



Qy 146 NNLSVNAENQRRLKVYMNQVCDDTITSRLNSSVQLAGLRLLTNMTVTNEYQHMLANSISD 205

II I I I I III : : I : I : : I I I : | : I I I : I I I I I I I I I I I I I I I : : I I I I : I : I 



Db 128 NNLSVNVENQIKIKIYISQVCEDVFSGPLNSAVQLAGLTLLTNMTVTNDHQHMLHSYITD 187

Qy 206 FFRLFSAGNEETKLQVLKLLLNLAENPAMTRELLRAQVPSSLGSLFNKKENKEVTLKLLV 265 

I : : II I I : I I I I I I I I I : I I I I I I I I I I I I I I I I : : I I :: I : : I 
Db 188 LFQVLLTGNGNTKVQVLKLLLNLSENPAMTEGLLRAQVDSSFLSLYDSHVAKEILLRVLT 247 

Qy 266 IFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVCADKXLGIESHHDFLVKVKVGKFMA 325 

: I : I I : II: I I I I I I I I I : I I I : III I I I I : 
Db 248 LFQNIKNCLKIEGHLAVQPTFTEGSLFFLL-HGEECAQKIRALVDHHDAEVKEKVVTIIP 306 

Qy 326 KL 327 

I : 

Db 307 KI 308 



RESULT 15 
Q8BJ82 

ID Q8BJ82 PRELIMINARY; PRT; 784 AA. 

AC Q8BJ82; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-MAR-2004 (TrEMBLrel. 26, Last annotation update) 

DE Mus musculus 12 days embryo embryonic body between diaphragm region 

DE and neck cDNA, RIKEN full-length enriched library, clone:9430015G17

DE product:weakly similar to KIAA0512 PROTEIN (ARMADILLO REPEAT PROTEIN

DE ALEX2) (SIMILAR TO ARMADILLO REPEAT PROTEIN ALEX2).

GN Name=3230401N03Rik;

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; 

RC TISSUE=Embryonic body between diaphragm region and neck;

RX MEDLINE=99279253; PubMed=10349636;

RA Carninci P., Hayashizaki Y.;

RT "High-efficiency full-length cDNA cloning."; 

RL Meth. Enzymol. 303:19-44(1999). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; 

RC TISSUE=Embryonic body between diaphragm region and neck;

RX MEDLINE=21085660; PubMed=11217851;

RA RIKEN FANTOM Consortium;

RT "Functional annotation of a full-length mouse cDNA collection.";

RL Nature 409:685-690(2001). 

RN [3] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; 

RC TISSUE=Embryonic body between diaphragm region and neck; 

RA The FANTOM Consortium, 

RA the RIKEN Genome Exploration Research Group Phase I & II Team; 

RT "Analysis of the mouse transcriptome based on functional annotation of 

RT 60,770 full-length cDNAs."; 

RL Nature 420:563-573(2002). 

RN [4] 



RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; 

RC TISSUE=Embryonic body between diaphragm region and neck; 

RX MEDLINE=20499374; PubMed=11042159; 

RA Carninci P., Shibata Y., Hayatsu N., Sugahara Y., Shibata K., Itoh M., 

RA Konno H., Okazaki Y., Muramatsu M., Hayashizaki Y.; 

RT "Normalization and subtraction of cap-trapper-selected cDNAs to 

RT prepare full-length cDNA libraries for rapid discovery of new genes."; 

RL Genome Res. 10:1617-1630(2000). 

RN [5] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; 

RC TISSUE=Embryonic body between diaphragm region and neck; 

RX MEDLINE=20530913; PubMed=11076861; 

RA Shibata K., Itoh M., Aizawa K., Nagaoka S., Sasaki N., Carninci P., 

RA Konno H., Akiyama J., Nishi K., Kitsunai T., Tashiro H., Itoh M., 

RA Sumi N., Ishii Y., Nakamura S., Hazama M., Nishine T., Harada A., 

RA Yamamoto R., Matsumoto H., Sakaguchi S., Ikegami T., Kashiwagi K., 

RA Fujiwake S., Inoue K., Togawa Y., Izawa M., Ohara E., Watahiki M., 

RA Yoneda Y., Ishikawa T., Ozawa K., Tanaka T., Matsuura S., Kawai J., 
RA Okazaki Y., Muramatsu M., Inoue Y., Kira A., Hayashizaki Y.; 

RT "RIKEN integrated sequence analysis (RISA) system-384-format 

RT sequencing pipeline with 384 multicapillary sequencer."; 

RL Genome Res. 10:1757-1771(2000). 

RN [6] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; 

RC TISSUE=Embryonic body between diaphragm region and neck; 

RA Adachi J., Aizawa K., Akimura T., Arakawa T., Bono H., Carninci P., 

RA Fukuda S., Furuno M., Hanagaki T., Hara A., Hashizume W., 

RA Hayashida K., Hayatsu N., Hiramoto K., Hiraoka T., Hirozane T., 

RA Hori F., Imotani K., Ishii Y., Itoh M., Kagawa I., Kasukawa T., 

RA Katoh H., Kawai J., Kojima Y., Kondo S., Konno H., Kouda M., Koya S., 

RA Kurihara C., Matsuyama T., Miyazaki A., Murata M., Nakamura M., 

RA Nishi K., Nomura K., Numazaki R., Ohno M., Ohsato N., Okazaki Y., 

RA Saito R., Saitoh H., Sakai C., Sakai K., Sakazume N., Sano H., 

RA Sasaki D., Shibata K., Shinagawa A., Shiraki T., Sogabe Y., Tagami M., 

RA Tagawa A., Takahashi F., Takaku-Akahira S., Takeda Y., Tanaka T., 

RA Tomaru A., Toya T., Yasunishi A., Muramatsu M., Hayashizaki Y.; 

RL Submitted (JUL-2001) to the EMBL/GenBank/DDBJ databases. 

DR EMBL; AK034621; BAC28774.1; -. 

DR MGD; MGI:1914666; 3230401N03Rik. 

DR InterPro; IPR008938; ARM. 

DR InterPro; IPR006911; DUF634. 

DR Pfam; PF04826; DUF634; 1. 

SQ SEQUENCE 784 AA; 80987 MW; 2E724F1036F55B7A CRC64; 

Query Match 38.7%; Score 664; DB 2; Length 784; 

Best Local Similarity 44.0%; Pred. No. 2e-40; 

Matches 136; Conservative 56; Mismatches 95; Indels 22; Gaps 3; 

Qy 18 WSDDDDDSNESKSIWYPPWARIGTEAGTRARARARARATRARRAVQKRASPNSDDTVLS 77 

I : I : I I : : I : : I : I I I I : I 

Db 497 WTDTESDSDSEPDV PQRGKGKRT IPMHKRPFPYEIDEILG 536 

Qy 78 PQELQKVLCLVEMSEKPYILEAALIALGNNAAYAFNRDIIRDLGGLPIVAKILNTRDPIV 137 

:: I : I I I I : : I : hi : I I : I III I : I : : II I I II I I : I : : I II : 



Db 537 VRDLRKVLALLQKSDDPFIQQVALLTLSNNANYSCNQETIRKLGGLPIIANMINKTDPHI 596 



Qy 138 KEKALIVLNNLSWAENQRRLKVYMNQV^ 197 

Mill: : I I I I I III I I : I I I I : I II : I I I I : I I : I I : I I I I I : I I : I I I 
Db 597 KEKALMAMNNLSENYENQGRLQVYMNKVMDDIMASNLNSAVQWGLKFLTNMTITNDYQH 656 

Qy 198 MLANSISDFFRLFSAGNEETKLQVLKLLLNI^ 257 

: I I I I :: I I I I I I : I : : : I I : I I I I I I I : : I I Mill MM 
Db 657 LLVNSIANFFRLLSQGGGKIKVEILKILSNFAENPDMLKKLLGTQVPSSFSSLYNSYVES 716 

Qy 258 EVILKLLVIFENINDNFKWEENEPTQNQFGEGSLFFFLKEFQVCADKXLGIESHHDFLVK 317 

I : : : I : I I I M : I M Mill: II I : M II I I I 

Db 717 EILINALTLFEIIFDNLRAEVF--NNREFNKGSLFYLCTTSGVCVKKIRALANHHDLLVK 774 

Qy 318 VKVGKFMAK 326 

I I I I : I 
Db 775 VKVIKLVNK 783 



Search completed: January 7, 2005, 14:50:53 
Job time : 72.2668 secs 



